ABSTRACT

This report describes the real-time video (RTV) tracking system which
will be deployed at White Sands Missile Range (WSMR) during fiscal year
1979. This tracking system utilizes a distributive array of five
high-speed microprocessors to implement a real-time tracking algorithm
with sufficient intelligence to overcome the tracking problems associated
with typical noisy and cluttered backgrounds encountered in WSMR tracking
imagery. Target positions relative to boresight and target attitude
angles are also computed and recorded for each video field during the
tracking sequence, thus eliminating the cost and delay imposed by present
systems which require post-flight processing of film to extract these
parameters.

A computer simulation of the RTV tracking system has been developed at
WSMR to test the tracking and image processing algorithms utilized in the
system. The simulation program is described, along with other research
tools developed as part of the WSMR image processing laboratory. Finally,
significant results of an investigation of various image processing
techniques suitable for RTV tracking are presented.
PREFACE


This report presents the results of image processing research conducted
at White Sands Missile Range (WSMR) during the period September 1977
through March 1978. A concise description of the final design of the
real-time video tracking system presently being assembled at New Mexico
State University is also included.

Principal investigators contributing to this image processing research
are Dr. M. K. Giles and Dr. A. L. Gilbert. Dr. J. M. Taylor of NMSU
helped with the simulation program, and Dr. R. Machuca of NMSU provided
expert programming assistance. Mr. A. Garcia of WSMR also helped in
the operation, documentation, and maintenance of the image processing
laboratory. Mrs. R. Granger of WSMR prepared the final report
manuscript with help from Mr. M. Ramos in preparing the figures.
INTRODUCTION


A variety of methods of image data processing have become known over
the past decade. Much of the effort has been sponsored by the Defense
Advanced Research Projects Agency (DARPA) and by NASA to further
scientific understanding of imagery and image processing. These agencies
continue to sponsor image understanding research, and the various
military services, through their research sponsoring offices, as well as
the National Science Foundation have also become heavily involved in
pattern recognition and image understanding research.
Applications-oriented research at the US Army White Sands Missile Range
(WSMR) and at the US Army Night Vision Laboratories (NVL) has led
recently to systems of reasonably high sophistication using concepts
developed in-house and through sponsored research to solve complex
identification and tracking problems. WSMR has concentrated on objects
in the visible spectrum and in real-time, while NVL has been primarily
concerned with the infrared and in near real-time. Many other systems,
not necessarily real-time, have been developed for applications in
medicine, meteorology, and space research.

The development of an intelligent real-time video (RTV) tracking system
has been accomplished through the cooperative efforts of research and
development personnel at WSMR, New Mexico State University (NMSU), and
the Optical Sciences Center (OSC) of the University of Arizona. The
prototype RTV processor is being assembled at NMSU, the automatic zoom
lens and image rotator at the University of Arizona, and the system
interfaces at WSMR. The system components will be integrated and the
system deployed early in fiscal year 1979 as an add-on modification to
the Contraves Model F cinetheodolite at WSMR.

This report contains several sections which describe the RTV tracking
system and present the results of research related to the development
and evaluation of an intelligent video tracker. These sections are
listed below:

SECTION 1. A REAL-TIME VIDEO TRACKING SYSTEM

This section presents a concise description of the RTV tracking system
with its distributive array of high-speed processors. Much of the
information for this section was obtained from the annual report [1]
submitted by NMSU for contract DAAD07-77-C-0046.


[1] Flachs, G. M., P. I. Perez, R. B. Rogers, S. J. Szymanski, J. M.
Taylor, and Yee Hsun U, "A Real-Time Video Tracking System," Annual
Report for Contract DAAD07-77-C-0046, WSMR, January 1978.


SECTION 2. COMPUTER SIMULATION OF THE REAL-TIME VIDEO TRACKER

This section describes the simulation program and its utility in
both design and evaluation of real-time processing and tracking
algorithms. Representative results are included.


SECTION 3. TOOLS DEVELOPED AT WSMR FOR OPTICAL TRACKING RESEARCH

The hardware and software which comprise the WSMR image processing
laboratory are described in this section.

SECTION 4. RESULTS OF WSMR IMAGE PROCESSING RESEARCH


The  following  appendices  are  also  attached  to  the  report: 

APPENDIX I. "PATTERN RECOGNITION AND REAL-TIME BORESIGHT CORRECTION -
A TUTORIAL"

This appendix is an invited paper written by Alton L. Gilbert and
presented at the SPIE Seminar-in-Depth on Photo- and Electro-Optics
in Range Instrumentation, 13-14 March 1978, Fort Walton Beach,
Florida.


APPENDIX  II.  NOVEL  CONCEPTS  IN  REAL-TIME  OPTICAL  TRACKING 

This appendix is a paper written by Alton L. Gilbert and Michael K.
Giles for presentation at the Army Science Conference, 20-22 June
1978, US Military Academy, West Point, New York.
SECTION 1. A REAL-TIME VIDEO TRACKING SYSTEM


An intelligent RTV tracking system will be deployed early in fiscal
year 1979 as an add-on modification to the Contraves Model F
cinetheodolite at WSMR. The intelligence of the RTV tracker is contained
in the RTV processor. Figure 1.1 is a block diagram of the RTV tracking
system which shows the RTV processor as the central element. The RTV
processor receives standard composite video from a television camera,
locates the target image, and provides control signals which drive the
zoom and image rotator elements and point the Contraves tracking optics
at the target. It also provides boresight correction signals and target
attitude angles which are recorded during the vertical retrace period
on the video tape used to record the tracking sequence.


A RESEARCH-ORIENTED  PROCESSOR  CONFIGURATION 

Each of the four high-speed distributive microprogrammable processors
which comprise the RTV processor requires a stored microprogram to
control its designated tracking function. To provide a powerful tool for
future research in video tracking algorithms and to facilitate
operational testing of the RTV system, the control store of each
processor is realized with a read/write random access memory. A
minicomputer-based input/output processor is used as a programmable
interface among the user, the four distributive processors, and the
video tape recorder (VTR) system. This flexible, reprogrammable
structure is illustrated by the input/output processor configuration of
figure 1.2.

The input/output processor architecture consists of a PDP 11/35
minicomputer with associated peripherals and interfaces which allow the
user to load, store, edit, and debug the control programs for each of
the other four distributive processors; to monitor and display the
real-time performance of the RTV system; and to record the tracking data
on video tape during each vertical retrace period. The Tektronix 4014
graphics display terminal provides an excellent user terminal for
controlling and evaluating the performance of the RTV tracking system.

The  telephone  modem  and  floppy  disk  provide  the  capability  to  load  and 
store  the  tracking  programs  from  the  computer  system  at  NMSU. 


A STANDARD MICROPROGRAMMABLE PROCESSOR ARCHITECTURE

The four distributive processors are being built with a standard
microprogrammable processor architecture to simplify the development and
maintenance of the RTV tracking system. This standard architecture,
shown in figure 1.3, has been designed, built, and tested at NMSU.
Based on the new Texas Instruments (TI) 74S481 Schottky processor chip,
it provides a microinstruction cycle time of under 200 nanoseconds with
sufficient computational power to implement the required RTV tracking
algorithms. The standard architecture requires several LSI chips which
may be partitioned into control and processing sections.

The control section consists of a Signetics 8X02 microinstruction
address sequencing chip which provides a microprogram counter (MPC), a
1K x 48 read/write random access memory which serves as a control store
(CS) for the microinstructions used to implement the tracking
algorithms, and an edge-triggered D flip/flop microinstruction register
(MIR) which holds the currently executing microinstruction while the
next one is being accessed in the CS. Overlapping the execution of one
microinstruction with the fetch of the next one allows the processor to
achieve a minimum microinstruction cycle time equal to the larger of
the fetch time and the execution time, significantly increasing the
speed of the processor.

The  processing  section  consists  of  a cascadable  array  of  TI  74S481  4- 
bit  slice  processing  elements,  a register  file  comprising  an  array  of 
high-speed  registers  used  by  the  processor  under  program  control  for 
temporary  storage  of  addresses  and  data,  input  and  output  flag  circuitry, 
and  an  external  bus  system.  The  input  and  output  flags  constitute  a 
set  of  processor  status  flags  which  may  be  tested  under  microprogram 
control.  In  addition,  some  flags  are  used  for  communication  between 
processors  or  for  testing  signals  external  to  the  system. 

The four processors which comprise the RTV processor are described in
some detail in the following paragraphs. In each case, the processor
is built around the standard architecture described above. Some
specialized hardware is added to the standard configuration in each case
to accommodate the specific functions of the individual processors.


THE  VIDEO  PROCESSOR 

The video processor decomposes each video field into target, plume, and
background pixels at the standard video rate of 60 fields per second.
As the TV camera scans the scene, the video intensity is digitized at
m equally spaced points across each horizontal scan line. A resolution
of m = 512 pixels per line results in a pixel period of 96 nanoseconds.
Within 96 nanoseconds, a pixel intensity is digitized and quantized
into eight bits (256 gray levels), counted into one of six 256-level
histogram memories, and then converted by the decision memory to a
2-bit code indicating its classification (target, plume, or
background). The 2-bit classification code is passed to the projection
processor via the target data (TD) and projection data (PD) lines. TD
is high for target points; PD is high for plume points. This sequence
of pixel operations is illustrated by the upper path of the block
diagram of figure 1.4.

The sync stripper (figure 1.4) provides a video-derived sync signal
which synchronizes the pixel clock at the beginning of each video line
and provides vertical timing pulses for the region definition logic and
the other distributive processors. The pixel clock provides the sample
clock for the A/D converter and the horizontal timing pulses for the
region definition logic. The region definition logic converts the
horizontal timing pulses to target and plume strobes (TS and PS) which
are transmitted to the projection processor with an appropriate delay
for pixels located within the target and plume windows described below.

During the vertical retrace period (approximately 1.2 msec) the video
processor applies the decision algorithm contained in its control store
to the stored histograms of the field just digitized and then sets the
appropriate bits in each of the 256 words of decision memory to allow
real-time classification of the next field of pixels. Each 2-bit word
of the decision memory represents one of the 256 possible levels of
pixel intensity. During the real-time pixel classification operation,
the two bits of a given word are connected to the TD and PD lines
whenever a pixel intensity corresponds to the address of that word in
the decision memory.

Toward the end of the vertical retrace period, the video processor
receives the computed locations of the top left corners and the heights
and widths of the target and plume tracking windows from the tracking
processor via the communications memory (figure 1.4), and loads this
data into the region definition logic for use during the next field. A
set of counters in the region definition logic is preset with this
information just before digitization begins and decremented by the sync
and pixel clock signals to provide timing for the vertical and
horizontal extent of the tracking windows. Although the entire
field-of-view (FOV) of the TV camera is digitized at 60 fields per
second, only those pixels which lie within the target and plume windows
are classified, counted, and processed by the rest of the system.

The basic assumption of the image decomposition method is that the
target image has some video intensities not contained in the immediate
background. A tracking window is placed about the target image, as
shown in figure 1.5, to sample the background intensities immediately
adjacent to the target image. The window frame is partitioned into
two regions, B and P. Region B is used to provide a sample of the
background intensities, and region P is used to sample the plume
intensities when a plume is present. Using the sampled intensities, a
very simple decision rule is used to classify the pixels in region T as
follows:
• Background points: All pixels in region T with intensities found
in region B are classified as background points.

• Plume points: All pixels in region T with intensities found in
region P, but not found in region B, are classified as plume
points.

• Target points: All pixels in region T with intensities not found
in either region B or P are classified as target points.
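
The decision rule above can be sketched in software as a 256-entry
lookup table, which mirrors the role the text assigns to the decision
memory. This is only an illustrative sketch: the function names, the
integer class labels, and the use of NumPy are assumptions made here and
are not taken from the report, which implements the rule in microcode
and hardware.

```python
import numpy as np

# Hypothetical class labels chosen for this sketch only.
BACKGROUND, PLUME, TARGET = 0, 1, 2


def build_decision_memory(hist_b, hist_p):
    """Build a 256-entry table mapping gray level to class label.

    hist_b and hist_p are 256-bin intensity histograms of regions B and P.
    """
    hist_b = np.asarray(hist_b)
    hist_p = np.asarray(hist_p)
    # Default: an intensity seen in neither B nor P is a target intensity.
    memory = np.full(256, TARGET, dtype=np.uint8)
    # Intensities found in P are plume intensities ...
    memory[hist_p > 0] = PLUME
    # ... unless they are also found in B, which takes precedence.
    memory[hist_b > 0] = BACKGROUND
    return memory


def classify(pixels_t, memory):
    """Classify the 8-bit pixel intensities of region T by table lookup."""
    return memory[np.asarray(pixels_t, dtype=np.uint8)]
```

For example, with background samples at gray levels 10 and 12 and plume
samples at 12 and 200, a region T pixel at level 12 is labeled
background, one at 200 is labeled plume, and one at 50 is labeled
target.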

The six histogram memories used in the video processor accumulate
intensity histograms of the three regions (B, P, and T in figure 1.5)
within each of two independent tracking windows. The control store may
be programmed to implement more complex decision rules, when
appropriate, by fully exploiting the statistics of these intensity
histograms. For example, learned estimates of the probability density
functions for target, plume, and background intensities can be obtained
from the measured histograms, and a Bayesian classifier can be used to
decide whether a given pixel should be classified as a target, plume,
or background point.
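
As a rough illustration of the Bayesian alternative mentioned above, the
sketch below turns three class histograms into a decision table by
choosing, for each gray level, the class with the largest prior times
likelihood. The report does not specify how the density estimates are
learned, so several things here are assumptions: that a separate
histogram is available for each class, the add-one smoothing, the equal
default priors, and all names.

```python
import numpy as np

# Hypothetical class labels; row order below is background, plume, target.
BACKGROUND, PLUME, TARGET = 0, 1, 2


def bayes_decision_memory(hist_background, hist_plume, hist_target,
                          priors=(1 / 3, 1 / 3, 1 / 3)):
    """Return a 256-entry table of maximum a posteriori class labels.

    Each histogram has 256 bins. priors gives the assumed prior
    probability of background, plume, and target, in that order.
    """
    hists = np.array([hist_background, hist_plume, hist_target],
                     dtype=float)
    # Add-one smoothing (an assumption) avoids zero likelihoods.
    hists += 1.0
    # Normalize each histogram into an estimated density over gray levels.
    likelihoods = hists / hists.sum(axis=1, keepdims=True)
    posteriors = likelihoods * np.asarray(priors, dtype=float)[:, None]
    # Ties resolve to the lowest label, i.e. background.
    return posteriors.argmax(axis=0).astype(np.uint8)
```

The resulting table can be loaded and used exactly like the table of
the simple rule, so only the retrace-time computation changes.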

A tracking window placed about the target image provides a method for
sampling the pixel features associated with the target and background
images. The background sample should be taken relatively close to the
target image, and it must be of sufficient size to accurately
characterize the background intensity distribution in the vicinity of
the target. The tracking window also serves as a bandpass filter by
restricting the target search region to the immediate vicinity of the
target. Although one tracking window is satisfactory for tracking
missile targets with plumes, two windows provide additional reliability
and flexibility for independently tracking a target and plume, or two
targets. Figure 1.6 shows typical tracking situations with two tracking
windows. Having two independent windows allows each to be optimally
configured and provides reliable tracking when either window can track.

If the target to be tracked requires only one window, then the other
window can be expanded to include the entire FOV as shown in figure 1.7.
The outer window provides additional reliability since it can locate
the target image as long as it is in the optics FOV. However, the
outer window is subject to more noise due to its larger size.


THE  PROJECTION  PROCESSOR 


The projection processor consists of a projection accumulation memory
(PAM) and a standard processor which are designed to form projections
of simultaneous target and plume windows and to compute structural
parameters from the projections. The pixel data from each tracking
window enters the PAM in real-time as a synchronized serial stream on
lines TD and PD. As the classified pixel data is received, the PAM
accumulates the projection data while the processor monitors the
y-projections, accumulates the total number of target and plume points,
and determines the midpoints used to split the x-projections. Each
x-projection is split to allow the computation of target and plume
attitude angles based on the locations of the median centers of the
x- and y-projections of the top half and bottom half of the target and
plume images. In the vertical retrace interval, the processor assumes
addressing control of the PAM and computes the structural parameters
from the projection data. A block diagram of the projection processor
is shown in figure 1.8.

Each incoming pixel is assigned a memory location in the appropriate
1536 x 9 random access memory (RAM). A separate memory location is
preassigned for each row and column of each tracking window. Target
and plume windows containing up to 511 x 511 pixels can be accommodated.
The first row of pixels in a given window is used to clear the memory
locations assigned to the vertical columns of pixels to initialize the
accumulation of the y-projection for that row. Once the memory
locations for the columns are cleared, the value of the next pixel which
occurs in each column is added to zero and the result is written back
into the same memory location for that column by an arithmetic logic
unit (ALU). The value (1 or 0) of each subsequent pixel, i.e., the
value of TD or PD for each pixel, is added to the contents of the
appropriate memory location by the ALU used to accumulate the
x-projections. Similarly, after the memory location corresponding to a
given horizontal row is cleared by the first pixel occurring in that
row, the value of each subsequent pixel occurring in the row is added to
the contents of the memory location to accumulate the value of the
y-projection for the row. A separate ALU and high-speed RAM are used
for each projection to accumulate both horizontal and vertical
projections simultaneously at pixel rates of 11 MHz or greater.
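
The accumulation described above can be mimicked in software by walking
a binary window in raster order and adding each pixel value to one
per-column counter and one per-row counter. The sketch below is only an
illustration of that idea under the convention used in this section (the
y-projection holds one count per line, the x-projection one count per
column); the function name and the list-of-lists window format are
assumptions, and the hardware clearing and dual-ALU details are not
modeled.

```python
def accumulate_projections(window):
    """Accumulate x- and y-projections of a binary window in raster order.

    window is a non-empty list of rows, each a list of 0/1 pixel values
    (the TD or PD value of each pixel). Returns (x_projection,
    y_projection), with one entry per column and per row respectively.
    """
    rows, cols = len(window), len(window[0])
    x_projection = [0] * cols  # one counter per column
    y_projection = [0] * rows  # one counter per row (line)
    for r, line in enumerate(window):
        for c, bit in enumerate(line):
            # Both projections are updated for every incoming pixel,
            # as the two ALUs do simultaneously in the hardware.
            x_projection[c] += bit
            y_projection[r] += bit
    return x_projection, y_projection
```

Both projections always sum to the same total, the number of classified
points in the window.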

The processing portion of the projection processor uses the standard
processor with five 4-bit slices and an expanded register file to
accommodate targets or plumes consisting of more than 32k points. As
the PAM accumulates the projections for a given line of the target
window, the total number of target points in that line (the
y-projection) is multiplied by two and stored by the processor. The
same is true for lines in the plume window. All lines following the
first active line of the window are processed in the same manner, and
the total number of target points times two is accumulated. If at the
end of a line this number is greater than the total number of target
points in the previous frame, the top x-projection is terminated. A
flag is sent to the PAM forcing the x-projection for the next line to
be placed at the starting address of the bottom x-projection. Thus,
three separate projections are accumulated for each window: a
y-projection, and top and bottom x-projections. These six projections,
stored in the two 1536 x 9 RAMs at the end of each field, are mapped
into a continuous address space at the beginning of the vertical
retrace period to allow easy access by the projection processor.
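
The splitting step can be sketched as follows: the running point count
is doubled and compared, at the end of each line, with the total point
count of the previous frame, and once it exceeds that total all later
lines are accumulated into the bottom x-projection. The function name,
the `previous_total` parameter name, and the returned `split_line` index
are assumptions made for this sketch.

```python
def split_x_projection(window, previous_total):
    """Accumulate top and bottom x-projections of a binary window.

    window is a non-empty list of rows of 0/1 values. previous_total is
    the total number of points counted in the previous frame. Returns
    (top, bottom, split_line), where split_line is the index of the last
    line placed in the top projection, or None if no split occurred.
    """
    cols = len(window[0])
    top = [0] * cols
    bottom = [0] * cols
    doubled_count = 0   # two times the number of points seen so far
    in_top = True
    split_line = None
    for r, line in enumerate(window):
        destination = top if in_top else bottom
        for c, bit in enumerate(line):
            destination[c] += bit
        doubled_count += 2 * sum(line)
        # Tested at the end of the line: once half of the previous
        # frame's points have been passed, switch to the bottom half.
        if in_top and doubled_count > previous_total:
            in_top = False
            split_line = r
    return top, bottom, split_line
```

Using the previous frame's total lets the split be decided while the
current field is still arriving, at the cost of a small error when the
target size changes between frames.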

During the vertical retrace interval, the projection processor divides
each projection into eight segments of equal mass using a simple
algorithm to sequentially address each line of the projection and
multiply the number of pixels in the line by eight. If the result
exceeds the total number of pixels in the projection, a flag is sent to
the PAM forcing the next line to be placed at the beginning of the next
1/8 segment of the projection. If the result is less than the total
number of pixels in the projection, additional lines of pixels are
accumulated until the line containing the 1/8 percentile point is
located.

The A and B ports of the processor (see figure 1.8) are used directly to
perform the multiplications required to respectively split the
x-projections and find the 1/8 percentile points for each projection.
Data read in on the A-bus is multiplied by two with a hard-wired shift,
while data read in on the B-bus is multiplied by eight in the same
manner.

The 1/8 percentile points for each of the six projections shown in
figure 1.9 are computed within 410 microseconds of the vertical retrace
period and then passed to the communication memory along with the total
number of target and plume points. These parameters constitute the
structural parameters used by the tracker processor to define an
intelligent tracking strategy.
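
One way to picture the 1/8 percentile computation is as a scan that
compares eight times the running count with multiples of the total, so
that only shifts, additions, and comparisons are needed. The sketch
below follows that idea; the exact tie-breaking and indexing conventions
of the microcode are not given in the text, so those details, and the
function name, are assumptions.

```python
def octile_points(projection):
    """Return the indices of the seven 1/8 percentile points.

    projection is a list of non-negative counts. The k-th returned index
    (k = 1..7) is the first position at which the running count reaches
    k/8 of the total. An empty list is returned for an empty projection.
    """
    total = sum(projection)
    if total == 0:
        return []
    points = []
    running = 0
    k = 1
    for index, count in enumerate(projection):
        running += count
        # Multiplying by eight (a 3-bit shift) avoids any division.
        while k <= 7 and 8 * running >= k * total:
            points.append(index)
            k += 1
    return points
```

The fourth point returned (k = 4) is the median of the projection, which
is the quantity used later by the projection median technique.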


TRACKER  PROCESSOR 

The tracker processor receives the structural parameters from the
projection processor, locates and characterizes the structure of the
target and plume images, and decides on a tracking strategy to maintain
track. It then outputs control signals to place the window frames in
the video processor and outputs target location and orientation data
to the control processor along with a confidence in the measured data.
Since it operates on the projection data from field n while the
projections for the next field (n+1) are being accumulated, the tracker
processor is always one field behind the video and projection
processors. The tracker and control processors must both finish their
calculations before the vertical retrace interval begins for field n+1.
This constraint requires the tracker processor to output its data to
the control processor within 7 milliseconds after it receives the
projection data.

Conceptually,  the  tracking  algorithm  can  be  viewed  as  a finite  state 
Mealy  machine  defined  by  the  following  next  state  and  output  equations: 
     q_n = δ(q_{n-1}, i_n, i_{n-1}, ..., i_{n-k})

     z_n = ω(q_n, i_n, i_{n-1}, ..., i_{n-k})


Here q_n is an interpretation of the present FOV situation by the
tracking algorithm based upon the previous interpretation and the
sequence of inputs i_n, ..., i_{n-k} corresponding to the input data
from the projection processor for frames n, ..., n-k, respectively. The
output z_n is the tracking strategy given to the video and control
processors in response to the inputs. δ and ω represent the next state
and output mappings, respectively.


A target with a plume is treated as two distinct objects by the
projection processor which outputs two disjoint sets of data. The
tracking algorithm processes the next state mapping for each object and
determines the best response strategy to maintain track. This is
illustrated by the following machine equations, in which the
superscripts T and P denote the target and the plume:


     q^T_n = δ^T(q^T_{n-1}, i^T_n, i^T_{n-1}, ..., i^T_{n-k})

     q^P_n = δ^P(q^P_{n-1}, i^P_n, i^P_{n-1}, ..., i^P_{n-k})

     z_n = ω(q^T_n, q^P_n, i_n, i_{n-1}, ..., i_{n-k})


Since the tracker processor is the only processor that communicates with
all of the other three processors, each of which has its own coordinate
system, the tracker processor must interpret the input data
intelligently and then output the appropriate data to the video and
control processors in their respective coordinate systems. The 44
inputs presented in figure 1.10 are positive 16-bit integers defined for
a coordinate system whose origin is the first pixel scanned inside the
appropriate tracking window. The eight outputs to the video processor
presented in figure 1.11 are 9-bit positive integers defined for a
coordinate system whose origin is the first pixel scanned within the
FOV. Of the seven 16-bit outputs to the control processor listed in
figure 1.12, DX, DY, DRX, and DRY are integers defined for a coordinate
system whose origin is the boresight. The left-most bit (bit 15) is
used as the sign bit for each word. STATE is also in the integer
format. A binary point is assumed between bits 15 and 14 for WGHT, and
between bits 11 and 10 for DZ.
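
The binary-point positions quoted above translate into scale factors as
follows: a point between bits 15 and 14 leaves 15 fractional bits, and a
point between bits 11 and 10 leaves 11 fractional bits. The sketch below
decodes such words into ordinary numbers. It treats both words as
unsigned, which is an assumption of this sketch (the text does not say
how the sign bit applies to WGHT and DZ), and the function names are
hypothetical.

```python
def decode_fixed(word, fraction_bits):
    """Decode an unsigned 16-bit fixed-point word.

    fraction_bits is the number of bits to the right of the binary point.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return word / (1 << fraction_bits)


def decode_wght(word):
    """WGHT: binary point between bits 15 and 14 (15 fractional bits)."""
    return decode_fixed(word, 15)


def decode_dz(word):
    """DZ: binary point between bits 11 and 10 (11 fractional bits)."""
    return decode_fixed(word, 11)
```

Under this reading, a WGHT word of 0x8000 represents 1.0 and 0x4000
represents 0.5, while a DZ word of 0x0C00 represents 1.5.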

The angle from vertical boresight included in figure 1.12 is computed
directly from the projection data using the projection median technique
illustrated by figure 1.13. The projection median technique dissects
a binary image into two segments. The x- and y-projections are
accumulated for each segment and the median coordinates TXT, TXB, TYT,
and TYB defined in figure 1.10 are evaluated for each projection. The
rotational angle with respect to the y-axis is then defined as the angle
between the y-axis and the line joining the two median centers:

     θ_PM = arctan[(TXT - TXB) / (TYT - TYB)]


There are two main advantages to using the median center instead of the
geometric centroid to determine the angle of rotation. First, the
medians are easier to compute, and the projections used to compute the
medians provide additional information, such as length-to-width ratio
and target shape. Second, the median is less sensitive to noise
perturbations because the distance of the noise pixels from the target
is not used in the median computation.
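
The projection median technique can be sketched in a few lines: split
the binary image at the line where half of its points have been
accumulated, find the median of the x- and y-projections of each half,
and take the angle of the line joining the two median centers relative
to the y-axis. In the sketch below the names TXT, TYT, TXB, and TYB
follow the text, while the function names, the tie-breaking rule for the
median, and the raster convention (y increasing downward) are
assumptions.

```python
import math


def median_index(projection):
    """Index at which the running count first reaches half the total."""
    total = sum(projection)
    if total == 0:
        raise ValueError("projection contains no points")
    running = 0
    for index, count in enumerate(projection):
        running += count
        if 2 * running >= total:
            return index


def segment_medians(segment, row_offset):
    """Median center (x, y) of one image segment, in image coordinates."""
    x_projection = [sum(column) for column in zip(*segment)]
    y_projection = [sum(line) for line in segment]
    return median_index(x_projection), row_offset + median_index(y_projection)


def projection_median_angle(image):
    """Angle in degrees between the y-axis and the median-center line.

    image is a list of rows of 0/1 values containing points in both the
    top and bottom halves; otherwise ValueError is raised.
    """
    y_projection = [sum(line) for line in image]
    split = median_index(y_projection) + 1  # first row of the bottom half
    txt, tyt = segment_medians(image[:split], 0)
    txb, tyb = segment_medians(image[split:], split)
    # Same ratio as (TXT - TXB) / (TYT - TYB); atan2 with a positive
    # denominator (TYB > TYT in raster order) avoids division by zero.
    return math.degrees(math.atan2(txb - txt, tyb - tyt))
```

A vertical bar yields an angle of zero, and a diagonal line running from
the top left to the bottom right yields 45 degrees under the sign
convention chosen here.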

With the orientation known, other image position information such as the
target tail or the plume tip can be easily computed from the
projections. When the target is being tracked, the target tail is used
to compute the x- and y-displacements from boresight. To allow a smooth
transition from target window to plume window tracking, i.e., when the
target becomes too small to track, the plume tip is used to compute the
boresight displacements.

An overall view of the functions of the tracker processor is given in
figure 1.14. It has two modes of operation, the initial acquisition
mode and the autotrack mode.

The initial acquisition mode is used when the RTV system is trying to
lock onto the target of interest. During this mode, the video processor
does little or no learning on the target and plume intensities. The
tracker processor will not instruct the control processor to begin
predicting the target location until it is sure of the existence of at
least the plume within the plume window. When the plume image moves
into an appropriate region of the FOV, the tracker processor will notify
both the video processor and the control processor with a flag
indicating that it is now ready to shift into the autotrack mode.

The autotrack algorithm is divided into the four main modules shown in
figure 1.14. The data conversion module transforms the projection input
data into physical variables, such as target and plume size, position,
and shape. These variables are then combined with previous target
activity data from the history update module to obtain additional
variables, such as the changes in target and plume position and size.
All of these variables are compared with preassigned reference constants
to obtain a set of binary inputs which are used directly by the state
interpretation module (see table 1.1) to define the current tracking
situation and produce an optimum tracking strategy. The strategy is
implemented by the output computation module in the form of the control
signals to the video and control processors described in figures 1.11
and 1.12.

The tracking situations of both the target and plume images are described
by four states which, when combined, form a total of 16 system tracking
states. Table 1.2 lists the four image states and eight examples of the
system tracking states.

The basic hardware architecture of the tracker processor consists of the
standard TI 74S481 microprogrammable processor and a 2k x 16 RAM. This
working memory contains the various lookup tables, data storage area,
and the scratch pad area.

It is highly desirable to keep the control store of the tracker processor
within 1k words because of the 10-bit address bus of the 8X02 control
sequencer. This constraint heavily influenced the basic design and
implementation of the tracker processor algorithm, whose computation,
though rather simple and straightforward, is lengthy. Lookup tables are
used whenever possible to simplify the computations and reduce the
microcode. At a cycle time of less than 200 nanoseconds, the standard
processor is capable of executing over 35k instructions within the
7-millisecond time limit, which is more than enough for the current
algorithm.
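
The instruction budget quoted above follows from simple arithmetic. It is
worked here at the 200-nanosecond upper bound on the cycle time, so a
shorter cycle only enlarges the budget:

```latex
\[
N = \frac{7\ \text{ms}}{200\ \text{ns}}
  = \frac{7 \times 10^{-3}\ \text{s}}{2 \times 10^{-7}\ \text{s}}
  = 35{,}000\ \text{instructions per field}
\]
```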

The state interpretation module is implemented as a lookup table because
of its finite state machine approach. The target inputs are encoded into
10 bits as in the example of table 1.1, and the plume inputs are encoded
similarly into 7 bits. By using the inputs as an address, the state
interpretation module is implemented as a 1k by 16 lookup table in the
working memory. A state transition is effected when the set of target
inputs for a given pixel field addresses the proper memory location in
the lookup table. The optimum next state for that set of target inputs
is read out of the field of 2 bits selected by the current target state
as shown in figure 1.15. Similarly, the set of plume inputs and the
current plume state are used to select the optimum next plume state, and
the new plume and target states are combined to define the optimum
tracking state to be implemented during the next video field.
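
The table-driven state transition described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
placement of the four 2-bit fields in the low byte of each word, the
function names, and the table contents are hypothetical and are not taken
from figure 1.15.

```python
# Image states as listed in table 1.2.
NORMAL, ABRUPT, OUT_OF_FOV, LOST = 0b00, 0b01, 0b10, 0b11

TABLE_SIZE = 1 << 10  # 1k words, addressed by the 10-bit target input code


def pack_word(next_states):
    """Pack four 2-bit next states, indexed by current state, into one word.

    Assumed layout: the field for current state s occupies bits 2s and 2s+1.
    """
    word = 0
    for current, nxt in enumerate(next_states):
        word |= (nxt & 0b11) << (2 * current)
    return word


def next_state(table, inputs, current):
    """Address the table with the binary inputs, then select the 2-bit
    field that belongs to the current state."""
    word = table[inputs & (TABLE_SIZE - 1)]
    return (word >> (2 * current)) & 0b11


def system_state(target_state, plume_state):
    """Combine the two 2-bit image states into one of 16 system states."""
    return (target_state << 2) | plume_state


# Hypothetical table contents: every input code keeps the current state...
table = [pack_word([NORMAL, ABRUPT, OUT_OF_FOV, LOST])] * TABLE_SIZE
# ...except one made-up input code that forces "lost" from any state.
HYPOTHETICAL_LOST_CODE = 0b0000000001
table[HYPOTHETICAL_LOST_CODE] = pack_word([LOST, LOST, LOST, LOST])
```

Changing the tracking strategy in this scheme means rewriting table
entries only; `next_state` itself never changes.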

The zoom correction algorithm is also computed by means of a lookup table.
The use of lookup tables for implementing algorithms has many advantages.

TABLE 1.1. EXAMPLE BINARY TARGET INPUT DATA TO THE STATE INTERPRETATION
MODULE

Bits       Binary Values   Target Condition
0          1               Target too long
1 and 2    10              Too many target points
3 and 4    10              Target size increased too much
5          1               Correct target shape
6          1               Random target motion
7          1               Target within field-of-view
8          1               Trackable target location
9          1               Drastic change of target location


TABLE 1.2. THE FOUR IMAGE STATES AND EIGHT OF THE 16 SYSTEM TRACKING
STATES

Image States   Description
00             Normal tracking
01             Abrupt change
10             Out of FOV
11             Lost

State   Target   Plume   Description
S1      00       00      Normal tracking of both images
S2      00       01      Target tracking, abrupt plume change
S3      00       10      Target tracking, plume out of FOV
S4      00       11      Target tracking, plume lost
S5      11       00      Target lost, plume tracking
S6      11       01      Target lost, abrupt plume change
S7      11       10      Target lost, plume out of FOV
S8      11       11      Target lost, plume lost (use trajectory data)

The use of lookup tables significantly reduces the amount of required
control store but, more importantly, it allows the next state mapping,
and hence the response strategy, to be easily modified to suit other
forms of images without changing the basic algorithm in the microcode.


THE  CONTROL  PROCESSOR 

The function of the control processor is to generate the four control
signals that drive the real-time video tracker, i.e., the tracker azimuth
and elevation, which are sent to the RTV-Contraves system interface, and
the optics rotation and zoom, which are sent to the RTV-zoom/rotation
interface (figure 1.1). In addition, the control processor outputs the
following tracking data to the input/output processor after each field
so that they can be recorded in the vertical retrace period of the
video tape: field count, tracker status, time, x-displacement from
boresight, y-displacement from boresight, tangent of the target
orientation angle from vertical boresight, target azimuth, target
elevation, tracker azimuth, tracker elevation, image rotation angle, and
zoom ratio.

The tracking optics feeds the target image to the video processor portion
of the RTV processor (figure 1.1), which establishes the target coordinates
with respect to the optics boresight. The control processor combines
current target coordinates with previous target coordinates to point the
optics toward the next expected target position. Radar-derived tracking
data is also available to the control processor from the WSMR precision
acquisition system (PAS).

There are several inherent differences between the optical and the PAS
data. First, the optical system tracks the target in real time with 60
updates per second using a 2-dimensional pointing vector, while the PAS
tracks the target in 3-dimensional space with a 200-millisecond delay
and 20 updates per second. Second, when the target is visible, the
optics data should be an order of magnitude more precise than the PAS
data. The PAS data, however, is available when the optical data is lost
due to clouds or poor visibility. The control processor must take these
differences into consideration and generate the "best" estimate control
signals to point the optics towards the target. The confidence weight
WGHT and the tracking STATE inputs from the tracker processor allow the
control processor to properly weight the optical data in the control
equations. Since the optical data is generally more precise, it is used
as the primary tracking data. A manual override capability is also
provided to reacquire track if necessary.

Using the target position (DX, DY), target rotation (DRX, DRY), and zoom
correction (DZ) inputs from the tracker processor (figure 1.12) together
with the actual tracker position ($A_0$, $E_0$), image rotation ($\phi_0$),
and zoom ratio ($Z_0$) from encoders on the tracking mount (figure 1.1),
image rotator, and zoom lens respectively, the control processor computes
measured values for the four control signals ($A_M$, $E_M$, $\phi_M$,
$Z_M$). The following measured control equations are implemented by the
control processor:

$$ Z_M = Z_0\,(1 + DZ) \tag{1.9} $$

$$ \phi_M = \phi_0 + \tan^{-1}\!\left(\frac{DRX}{DRY}\right) \tag{1.10} $$

$$ E_M = E_0 + \left[\frac{DY}{Y}\cos\phi_0 - \frac{DX}{X}\sin\phi_0\right]\frac{FOV}{Z_0} \tag{1.11} $$

$$ A_M = A_0 + \left[\frac{DX}{X}\cos\phi_0 + \frac{DY}{Y}\sin\phi_0\right]\frac{FOV}{Z_0\cos E_M} \tag{1.12} $$

X and Y are the dimensions in pixel units of the video field.

The predicted control equations are based on the combination of linear
and quadratic optical estimates taken from a five-deep history stack,
as well as any available PAS-derived estimates. Since the input data
is derived from field (K-1), and the estimates are being computed during
field K, the control estimates must predict ahead two time increments
to provide control signals which will place the boresight at the correct
position during frame K+1. Thus the linear optical estimates have the
form

$$ \theta(K+1 \mid K-1)_L = 3\,\theta(K-1) - 2\,\theta(K-2) \tag{1.13} $$

and the quadratic optical estimates have the form

$$ \theta(K+1 \mid K-1)_Q = \ldots\,[\,81\,\theta(K-1) - 20\,\theta(K-2) - 21\,\theta(K-3) + 12\,\theta(K-4) - 27\,\theta(K-5)\,] \tag{1.14} $$

A linear combination of the linear and quadratic predictors (equation
1.15) is used to predict the target location.

$$ \theta_{OPT}(K+1) = \alpha_1\,\theta(K+1 \mid K-1)_L + \alpha_2\,\theta(K+1 \mid K-1)_Q \tag{1.15} $$

The linear coefficients $\alpha_1$ and $\alpha_2$ are determined on the
basis of minimum variance of errors made by the predictors on their
estimate of the previous video field.
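
A minimal Python sketch of the measured and predicted control computations
follows. Several things here are assumptions rather than the report's own
definitions: the function names, the use of radians, the reading of FOV as
the field-of-view angle at unit zoom, and the form of the measured
equations (a rotation of the normalized boresight displacements through
the image rotation angle, scaled by FOV over zoom). The quadratic
coefficients are derived from an ordinary least-squares fit over five
samples and are not taken from equation 1.14.

```python
import math


def measured_controls(A0, E0, phi0, Z0, DX, DY, DRX, DRY, DZ, X, Y, fov):
    """Return (A_M, E_M, phi_M, Z_M) under the assumed equation forms.

    Angles are in radians; X and Y are the field dimensions in pixels.
    """
    Z_M = Z0 * (1.0 + DZ)
    phi_M = phi0 + math.atan2(DRX, DRY)  # angle from vertical boresight
    u, v = DX / X, DY / Y                # displacement as fraction of field
    E_M = E0 + (v * math.cos(phi0) - u * math.sin(phi0)) * fov / Z0
    A_M = A0 + (u * math.cos(phi0) + v * math.sin(phi0)) * fov / (
        Z0 * math.cos(E_M)
    )
    return A_M, E_M, phi_M, Z_M


def linear_estimate(history):
    """Two-step-ahead linear estimate (equation 1.13).

    history[0] is theta(K-1), history[1] is theta(K-2).
    """
    return 3.0 * history[0] - 2.0 * history[1]


def quadratic_estimate(history):
    """Two-step-ahead estimate from a least-squares quadratic fit to
    theta(K-1) ... theta(K-5). Coefficients derived here, not from the
    report."""
    coefficients = (15.0, -2.0, -9.0, -6.0, 7.0)
    return sum(c * h for c, h in zip(coefficients, history)) / 5.0
```

For example, a signal that grows by one unit per field, with
`history = [4, 3]`, gives `linear_estimate(history) == 6.0`, the value two
fields after the newest sample.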
When PAS data is available, $\theta_{PAS}(K+1)$ is obtained using the
5-point quadratic predictor (equation 1.14) at the video field rate, and
the linear minimum variance predictor of equation 1.16 is used to predict
the target location.


The coefficients $\beta_1$ and $\beta_2$ are determined on the basis of
minimum variance of errors made by the optical and PAS predictors during
the previous field.
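
One common way to choose minimum-variance weights for two estimates is to
weight each inversely to its squared error on the previous field. The
sketch below assumes that rule, assumes the two estimates are unbiased and
uncorrelated, and uses hypothetical function names; the report does not
state the exact formula it uses.

```python
def minimum_variance_weights(err_a, err_b, eps=1e-12):
    """Weights for two estimates from their previous-field errors.

    The squared error of each predictor stands in for its variance
    (an assumption). eps avoids division by zero when both errors vanish.
    """
    var_a = err_a * err_a + eps
    var_b = err_b * err_b + eps
    w_a = var_b / (var_a + var_b)
    return w_a, 1.0 - w_a


def blend(est_a, est_b, err_a, err_b):
    """Linear combination of two estimates with minimum-variance weights."""
    w_a, w_b = minimum_variance_weights(err_a, err_b)
    return w_a * est_a + w_b * est_b
```

The weights always sum to one, so two estimates that agree are passed
through unchanged, and the estimate with the smaller recent error
dominates the result.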

The above estimates are computed for each of the four control signals.
These signals (azimuth, elevation, image rotation, and zoom) are then
passed as inputs to the control systems which respectively drive the
Contraves mount, the image rotator, and the zoom lens (see figure 1.1).
The control processor calculations are performed in floating-point to
provide the accuracy required by the respective control systems. Hence,
the fixed-point and integer inputs accepted by the control processor are
converted to floating-point, the calculations are performed, and then the
control signals are converted back to fixed-point numbers which are sent
to the appropriate interfaces.

The need for a floating-point multiply and divide capability strongly
influenced the decision to use the TI 74S481 as the standard processor
chip. A standard processor with four 4-bit slices is used in the control
processor architecture to provide a 16-bit word length which is
adequate to maintain tracking control. Angle data transferred in and
out of the interfaces are in integer format 16-bit circular binary.
The most significant bit has a weight of 180°, and the least significant
bit has a weight of about 20 seconds of arc. Intermediate calculations
are performed in floating-point format with 16 bits for the fraction and
16 bits for the exponent.
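
The 16-bit circular binary angle format can be illustrated with a short
Python sketch. The function names are hypothetical; the only facts used
are that 65,536 counts span one revolution, which gives the most
significant bit a weight of 180° and the least significant bit a weight of
360°/65,536.

```python
FULL_CIRCLE = 1 << 16  # 65,536 counts per revolution


def degrees_to_circular16(angle_deg):
    """Convert an angle in degrees to a 16-bit circular binary count.

    The angle wraps modulo 360 degrees, so 360 maps back to 0.
    """
    return round((angle_deg % 360.0) / 360.0 * FULL_CIRCLE) % FULL_CIRCLE


def circular16_to_degrees(count):
    """Convert a 16-bit circular binary count back to degrees."""
    return (count & 0xFFFF) * 360.0 / FULL_CIRCLE


# Weight of the least significant bit in seconds of arc (about 19.8).
LSB_ARCSEC = 360.0 * 3600.0 / FULL_CIRCLE
```

Because the count wraps naturally at 16 bits, sums and differences of
angles need no special handling at the 0°/360° boundary.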

The control store size for the control processor microprogram is
presently estimated to be about 900 48-bit control words. This does not
include the lookup tables for the required function calculations. The
function lookup tables will reside in RAM. These tables require an
additional 512 words.

Figure 1.16 shows the random access, control store, and I/O memory
required by the control processor. Also shown is the direction of data
flow. In this figure, it is assumed that the function tables for the
SIN(X), COS(X), and ATAN(Y,X) microroutines used in the control
equations 1.9 to 1.12 will reside in random-access memory.

Figure 1.2. The Input/Output Processor Configuration

Figure 1.3. Standard Processor Architecture

Figure 1.4. Video Processor Block Diagram

Figure 1.8. Projection Processor

Figure 1.9. Projection Memory Addressing

TARGET AND PLUME DATA (Format - Nonnegative Integers)

Number of target points / 4
Number of plume points / 4

1/8 percentile points for target top segment x-projection.
  TXT = TXPT(4) = median x-coordinate of the target top

1/8 percentile points for target bottom segment x-projection.
  TXB = TXPB(4) = median x-coordinate of the target bottom

1/8 percentile points for target y-projection.
  TYT = TYP(2) = median y-coordinate of the target top
  TYB = TYP(6) = median y-coordinate of the target bottom

1/8 percentile points for plume top segment x-projection.
  PXT = PXPT(4) = median x-coordinate of the plume top

1/8 percentile points for plume bottom segment x-projection.
  PXB = PXPB(4) = median x-coordinate of the plume bottom

1/8 percentile points for plume y-projection.
  PYT = PYP(2) = median y-coordinate of the plume top
  PYB = PYP(6) = median y-coordinate of the plume bottom

Figure 1.10. Input Data from Projection Processor
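
The 1/8 percentile points listed in figure 1.10 can be computed from a
projection (a count of image points per coordinate) by walking its running
total. The sketch below is an illustration only; the function name and
the tie-breaking rule (the first coordinate at which the running total
reaches k/8 of the points) are assumptions.

```python
def octile_points(projection):
    """Return the seven 1/8 percentile coordinates of a projection.

    projection[i] is the number of image points at coordinate i.
    Element k-1 of the result is the first coordinate at which the
    running total reaches k/8 of all points, so element 3 is the median.
    Returns None when the projection contains no points.
    """
    total = sum(projection)
    if total == 0:
        return None
    points = []
    cumulative = 0
    k = 1
    for coordinate, count in enumerate(projection):
        cumulative += count
        # Integer comparison avoids rounding problems with k/8 fractions.
        while k <= 7 and 8 * cumulative >= k * total:
            points.append(coordinate)
            k += 1
    return points
```

With the one-based indexing used in the figure, TYP(2) and TYP(6)
correspond to elements 1 and 5 of the returned list, and the medians
TXPT(4), TXPB(4), PXPT(4), and PXPB(4) correspond to element 3.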
WINDOW DATA (Format - Nonnegative Integers)

TARGET WINDOW DATA
  x & y coordinates of target window
  x & y dimensions of target window

PLUME WINDOW DATA
  x & y coordinates of plume window
  x & y dimensions of plume window

Figure 1.11. Output Data to Video Processor (VP)

DX        x-displacement from boresight
DY        y-displacement from boresight
DZ        zoom correction
DRX, DRY  target rotation; angle from vertical boresight = arctan(DRX/DRY)
STATE     tracking state
WGHT      tracking weight

Figure 1.12. Output Data to Control Processor (CP)

Figure 1.14. Tracker Processor Functions

Figure 1.15. Organization of the State Interpretation Lookup Table

Figure 1.16. Control Processor Memory Allocation


SECTION 2. COMPUTER SIMULATION OF THE REAL-TIME VIDEO TRACKER


A computer simulation of the RTV tracking system, incorporating the
algorithms used in the control stores of the four distributive processors,
has been developed and implemented on the PDP 11/35 system at WSMR. A
preliminary version of this program is described in detail in the 1976
NMSU final report.¹ The purpose of this simulation is to provide a
method for testing new design concepts and evaluating the RTV tracking
system under realistic tracking conditions. To simulate the complete
tracking system, the simulation model includes dynamic models for the
target trajectory and the Contraves Model F cinetheodolite tracking
system in addition to the RTV processor algorithms.

The simulated target is initially located at the launch site, L = (0, 0,
0), and its trajectory is computed in three dimensions from acceleration
profiles (Ax, Ay, Az) in the x, y, z orthogonal directions. The tracker
is located at T = (Tx, Ty, Tz), and it is initially positioned so that
the target will pass through the FOV of the tracking optics (see figure
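
A trajectory of this kind can be sketched by integrating an acceleration
profile from rest at the launch site and then computing the pointing
angles from the tracker position. The Python below is only an
illustration: the semi-implicit Euler step at the 60 Hz field rate, the
azimuth convention (measured from the +y axis toward +x), and all
function names are assumptions, not details of the FORTRAN IV simulation.

```python
import math


def simulate_trajectory(accel, n_fields, dt=1.0 / 60.0):
    """Integrate an acceleration profile from rest at L = (0, 0, 0).

    accel(t) returns (Ax, Ay, Az) at time t. Returns the list of target
    positions, one per field, starting with the launch site.
    """
    pos = [0.0, 0.0, 0.0]
    vel = [0.0, 0.0, 0.0]
    path = [tuple(pos)]
    for k in range(n_fields):
        a = accel(k * dt)
        for i in range(3):
            vel[i] += a[i] * dt    # update velocity first,
            pos[i] += vel[i] * dt  # then position (semi-implicit Euler)
        path.append(tuple(pos))
    return path


def pointing_angles(target, tracker):
    """Azimuth and elevation (radians) from tracker T to the target."""
    dx, dy, dz = (target[i] - tracker[i] for i in range(3))
    azimuth = math.atan2(dx, dy)  # assumed: from +y toward +x
    elevation = math.atan2(dz, math.hypot(dx, dy))
    return azimuth, elevation
```

Feeding each position in the returned path to `pointing_angles` gives the
ideal tracker azimuth and elevation against which a simulated tracker
response can be compared.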


SIMULATION  OUTPUTS 

The simulation model, written in FORTRAN IV, utilizes the Tektronix 4014
graphics terminal to display the performance of the tracking system.

Figure 2.2 shows the output from video field 5 of the simulation. The
target and plume images are plotted as they are seen by the video
processor, superimposed on a crosshair which locates the boresight of the
tracking optics. The control processor estimates the location of the
target for the next frame and provides the control signals to place the
camera boresight on the target. The performance of the control processor
can be evaluated by observing the target position relative to the camera
boresight.

In addition to the digitized target and plume images, the simulation
displays the target and plume tracking windows and the projections. The
actual and predicted target location and orientation, the current tracker
boresight location and image rotation, the target view angle, and the
zoom ratio are also written on the display terminal along with the time,
range, and index number of the displayed field. As figure 2.2
illustrates, these outputs provide a direct measure of the performance of
each of the four processor algorithms as well as an overall measure of
the tracking accuracy of the system. Optional graphics outputs designed
to enhance the initial user setup of the simulation include a
3-dimensional plot of the missile trajectory and plots of the dynamic
tracker azimuth, elevation, rotation, and zoom response for perfect
missile trajectory data. Thus, the user can visualize how the tracker
should respond and determine which tracking errors can be attributed to
tracker dynamics limitations. Figure 2.3 is a 3-dimensional plot of the
first 100 fields of the trajectory presently being used in the
simulation. It presents a challenging tracking sequence to the RTV system
with its rapidly changing direction and aspect.

¹ Flachs, G. M., W. E. Thompson, R. J. Black, J. M. Taylor, W. Cannon,
and Yee Hsun U, "A Pre-Prototype Real-Time Video Tracking System," Final
Report for Contract DAAD07-76-C-0024, WSMR, January 1977.

In addition to its graphics display capabilities, the simulation program
can dump the tracking states of up to 40 selected simulated tracking
fields onto a floppy disk, thus enabling the user to restart the
simulation at selected fields in the tracking sequence. Since the PDP
11/35 requires approximately 2 minutes to calculate the states for each
image field, this option offers a tremendous savings in time when the user
desires to observe the central or final portions of a tracking sequence.


SIMULATION  INPUT  DATA  AND  RESULTS 

Two types of input are available to the simulation: simulated digitized
video fields and actual digitized video fields from video tape
recordings of typical WSMR tracking sequences. The simulated video data is
produced by a sophisticated picture generator subroutine which generates
pixel intensities belonging to target, plume, background, and foreground
intensity distributions. The characteristics of these distributions
were established through studies of cinetheodolite film sequences at WSMR.
The picture generator not only produces simulated video, but it also
projects the target and plume images onto a plane normal to the tracker
boresight to simulate the proper perspective of the target and plume as
seen through the tracking optics.

Figure 2.4 contains two representative simulation outputs selected from
the first 100 fields of simulated digitized video. The missile
trajectory is the one presented in figure 2.3. The RTV tracker performs
well during this tracking sequence: track is maintained throughout the
sequence, and all processors appear to be functioning properly.

The recent development of an image processing laboratory at WSMR has
enabled research personnel to digitize sequential video fields of typical
tracking imagery. These fields of digitized video are now being used in
the RTV simulation and in the development of improved image segmentation
and structural analysis algorithms.

The simulation program has been modified to allow these digitized images
to be input directly to the video processor in place of the simulated
intensity values. This effort is just beginning. Much work remains to
be accomplished both in the modification and improvement of the
simulation program, and in the testing of the RTV processor algorithms
with actual digitized data. The preliminary results, however, are very
encouraging.

Figure 2.5(a) is a graphics display of a field of digitized video which
was injected into the RTV simulation. The simulation was allowed to
repeatedly process the same video field superimposed on the simulated
target trajectory for several frames to test the static and dynamic
responses of the RTV processors. The result after six processing frames,
shown in figure 2.5(b), verifies the effectiveness of the processing
algorithms used in the RTV tracker. Since no plume is present, both
windows are tracking the target. Also, image rotation has not been
incorporated in the simulation; rather, the tracking window is allowed to
rotate and align itself with the target. Several important conclusions
may be deduced from this result: the video processor successfully
identifies most of the target pixels; the projection data is being
accumulated and used effectively by the projection and tracker processors
to obtain the target orientation and the displacement of the target from
boresight; and both windows are tracking the target and closing down on
it to reduce noise.

Figure 2.2. Simulation Output at Field 5

Figure 2.3. Three-Dimensional Plot of Simulated Missile Trajectory
(for the first 100 video fields)

Figure 2.4. Simulation Outputs


SECTION  3.  TOOLS  DEVELOPED  AT  WSMR  FOR  OPTICAL  TRACKING  RESEARCH 


THE  IMAGE  PROCESSING  LABORATORY 

An image processing laboratory has been developed by the Advanced
Technology Office, Instrumentation Directorate, WSMR, to support ongoing
real-time optical tracking research. The image processing system is
built around a PDP 11/35 minicomputer with two floppy disk drives, a
Tektronix 4014 graphics display terminal, and a Tektronix 4631 hardcopy
unit. Peripheral support equipment includes a Sony VO-2850 U-Matic
videocassette recorder, two Sony TV monitors, an IPS video disk, a
Bright 7-track write-only magnetic tape drive, several TV cameras, a
Consolidated Video Systems 504 digital video signal corrector, and a
custom-built ISI digitizer. A Printronix line printer interfaced to
a PDP LSI-11 is also available in an adjacent laboratory for displaying
digitized picture files.

The present system configuration is shown in figure 3.1. A video tape
containing a typical tracking sequence is played, or a TV camera is
used to monitor a test mission. When an interesting tracking situation
occurs, the video disk record button is pushed, causing a sequence
containing from one to 300 TV frames to be stored on the disk. Either
video field of each TV frame stored on the video disk may be selected
for digitization by the computer software, which is input from a floppy
disk. The digitizer converts the video field into a 512 x 240 (or
256 x 240), 8-bit pixel field. Thus, each of the 122,880 (or 61,440)
pixels is quantized into one of 256 possible intensity levels, and this
array of intensity values is stored on a floppy disk. The pixels are
stored in 512 (or 256) columns with 240 pixels per column. The data
is transposed to a row-structured file before the image is processed.
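
The column-to-row transpose described above can be sketched as follows.
The flat, column-after-column input layout and the function name are
assumptions made for illustration; the TR256 and TR512 programs
themselves are not reproduced here.

```python
def transpose_to_rows(column_data, n_columns, n_rows=240):
    """Convert column-structured pixel storage to a list of rows.

    column_data is a flat sequence holding the image column after column,
    with n_rows pixels per column. The result holds n_rows rows of
    n_columns pixels each.
    """
    if len(column_data) != n_columns * n_rows:
        raise ValueError("pixel count does not match the image dimensions")
    return [
        [column_data[col * n_rows + row] for col in range(n_columns)]
        for row in range(n_rows)
    ]
```

For the larger format, `n_columns=512` and `n_rows=240` give the 122,880
pixels quoted in the text.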

The video signal corrector is used to produce the proper peak video
amplitude and phase required by the video disk and the digitizer.

A library of 20 digitized images is presently available for processing
at WSMR. These images contain targets and backgrounds which are
representative of those to be encountered by the real-time video tracker.
Figure 3.2 presents several examples from this library of images. A TV
camera is also being mounted on a cinetheodolite at WSMR to obtain video
sequences from the optical system on which the real-time video tracker
will be deployed during fiscal year 1979. Sequential video frames from
the cinetheodolite system will soon be added to the current library of
images. This library has provided the basic image data needed to
evaluate the effectiveness of many novel image processing algorithms,
including those to be deployed with the real-time video tracker as well
as several promising new techniques presently being investigated by
research personnel at WSMR. With the addition of sequential digitized
frames of video to the library of images, the entire target tracking loop
can be evaluated using the real-time video tracker simulation at WSMR.
Research personnel at other laboratories may also find this library of
images to be a useful source of data for evaluating future intelligent
optical tracking techniques.

The magnetic tape drive is used to transfer image files to 7-track tape
for distributing the library of digitized images to interested research
installations. Software has been developed at WSMR which allows these
tapes to be formatted for compatibility with most computer systems. In
addition, the data can also be transferred to 9-track tape or to
computer cards if necessary.

The main software programs developed and being used in the WSMR image
processing laboratory are listed in table 3.1 together with a brief
description of the purpose of each program. These programs are used to
digitize, transpose, trim, process, and display images; plot gray level
histograms; and copy images to magnetic tape for distribution to other
laboratories.


THE  RTV  SIMULATION  AND  THE  RTV  TRACKER  AS  RESEARCH  TOOLS 

The RTV simulation is being used as a research tool at WSMR. It is
especially effective in evaluating the RTV system performance and in
identifying and seeking solutions to real-time tracking problems before
the RTV tracking system is deployed. With the added capability of using
digitized video from a variety of tracking sequences as inputs to the
video processor, the simulation can now test the system performance under
a variety of tracking conditions, thus allowing thorough evaluation and
possible refinement of the tracking and processing algorithms and the
state transitions of the tracker processor.

When combined with the WSMR image processing laboratory, the simulation
becomes a powerful aid in evaluating the performance of novel image
filtering algorithms. Images can be digitized and processed using a
variety of filtering techniques. The effectiveness of the filter
algorithms can then be evaluated by comparing the simulation results
obtained when original and processed image fields are used as inputs to
the video processor.

The simulation can also be used to test new design concepts. Since it
is implemented with a modular structure, the processing and tracking
algorithms can be changed easily, allowing new algorithms to be
evaluated within the framework of the complete tracking system. For
example, features other than intensity could be derived from the
relationships between sampled pixels in the three regions of the target
tracking window. Possible candidates include texture, gradient, and
linearity measures, each of which could be implemented by modifying the
image decomposition algorithm.

TABLE 3.1. WSMR IMAGE PROCESSING SOFTWARE

Program   Language   Purpose
LPORIG    MACRO      Gray level display of digitized picture data on
LPSWU.    MACRO      a Printronix line printer.
LPLARG    MACRO

WRP256    FORTRAN    Gray level display of digitized picture data on
WRP512    FORTRAN    a Tektronix 4014 display terminal.

HISTP     FORTRAN    Histogram plot of the number of pixels at each
                     of the 256 gray levels in an entire digitized
                     image.

DHIST     FORTRAN    Histogram plot and gray level display of a
                     specified portion of a digitized image.

TR256     FORTRAN    Digitized image transpose from column structured
TR512     FORTRAN    to row structured pixel storage.

TRIM      FORTRAN    Selection and storage of a specified portion of
                     a digitized image.

DIG5F1    MACRO      Digitization of field 1 or field 2 of a video
DIG5F2    MACRO      frame into a 512 x 240 or 256 x 240 pixel field.
DIG2F1    MACRO
DIG2F2    MACRO

MAG256    MACRO      Transfer of digitized image files from floppy
MAG512    MACRO      disk to 7-track magnetic tape.

FILTER    FORTRAN    Selected filtering of digitized images. Filter
                     types include average, median, human visual,
                     Sobel, maxmin, and moment.

COPY      FORTRAN    Copying and formatting of magnetic tapes for
                     distribution.

PICTPR    FORTRAN    Target boundary extraction from image data.

STRUCT    FORTRAN    Structure analysis of target boundary.

Once deployed, the RTV tracking system will become a versatile research
tool with the tremendous advantage of real-time processing of video
fields. The research and evaluation capabilities of the simulation are
also present in the RTV tracker system, but with the added capability
of obtaining immediate results. Each of the four processors will have
a 1k x 48 writable control store for implementing the tracking
algorithm, and the input/output processor will facilitate the
reprogramming of the control stores with new algorithms. Additional
control stores can be implemented on an extender card if future tracking
algorithms require more than 1k of microcode.

Figure 3.2. Example of Digitized Images from the WSMR Image Library


SECTION 4. RESULTS OF WSMR IMAGE PROCESSING RESEARCH


The development of the WSMR image processing laboratory has enabled WSMR
personnel to investigate several novel image processing techniques in an
effort to develop an RTV tracking system that is robust under the normal
variety of tracking situations. The research can be divided into two
major areas: (1) extraction of potential targets from the background,
and (2) classification of the extracted objects as targets and nontargets.
Research personnel at Purdue University, NMSU, and the University of
Arizona's OSC have contributed to the success of the WSMR image
processing effort.


THRESHOLDING  TECHNIQUES  FOR  TARGET  EXTRACTION 

The simplest methods available for extracting potential targets from the
background utilize thresholding operations. The method used by the video
processor in the RTV tracker is a variation of thresholding in which the
histogram of the background region of the tracking window (assuming a
target with no plume) is used to identify potential target points. For
example, if the intensity of a pixel within the target region does not
occur within the background region, the pixel is identified as a
potential target point. The threshold in this case is defined by the
distribution of intensities in the background histogram.
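The histogram comparison rule described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions,
not the microcode of the video processor: the function names, the tiny
test window, and the min_count parameter are hypothetical, and the
background is taken to be a set of whole rows near the top and bottom
edges of the window.

```python
from collections import Counter

def background_histogram(image, background_rows):
    """Count how often each gray level occurs in the background rows."""
    counts = Counter()
    for r in background_rows:
        counts.update(image[r])
    return counts

def extract_target_points(image, background_rows, min_count=1):
    """Return a binary mask: 1 where a pixel's intensity occurs fewer
    than min_count times in the background histogram, otherwise 0."""
    hist = background_histogram(image, background_rows)
    return [[1 if hist[v] < min_count else 0 for v in row] for row in image]

# Hypothetical 5 x 6 window: bright sky (140-150) with a darker object (110).
image = [
    [140, 145, 150, 145, 140, 145],
    [145, 150, 110, 110, 145, 140],
    [150, 145, 110, 110, 150, 145],
    [145, 140, 145, 150, 145, 150],
    [140, 145, 150, 145, 140, 145],
]
mask = extract_target_points(image, background_rows=[0, 4])
for row in mask:
    print(row)
```

Raising min_count plays the role of the "occurs less than about N times"
tolerance mentioned for noisy background histograms.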

A typical digitized video image which demonstrates the utility of this
method is shown before and after target extraction in figure 4.1.
Histograms of the target and background regions are included in figure
4.2. These gray level plots and the associated histograms were generated
by allowing the DHIST program to operate on selected windows within the
background and target regions of the digitized image. The target region
in this case is the entire picture, while the background includes the
regions near the top and bottom edges of the picture. Using DHIST,
analyses of a variety of tracking images have verified the effectiveness
of the histogram comparison method of target extraction for most tracking
imagery encountered at WSMR. The target intensities are easily identified
in figure 4.2 as those intensities in the target region histogram which
do not occur (or which occur less than about 30 times, in this example)
in the background histogram; i.e., those intensities quantized at gray
levels below 128 in the target region histogram. The gray level plot of
figure 4.1(a) was generated by assigning a white level of 164 and a
black level of 100 as the highest and lowest intensities, respectively,
to be plotted as half-tone shades of gray. These limits include all
intensity values which occur in the digitized image. The gray level
plot of figure 4.1(b) was generated by assigning black and white levels
of 100 and 128, respectively, thus causing all intensity levels having
a value of 128 or greater to be displayed as white. The half-tone
pictures of figure 4.1 were generated using a 4 x 4 dot matrix for each
resolution element; hence, 17 shades of gray are displayed.

Additional intelligence can be incorporated into the simple thresholding
method of target extraction by using the measured histograms of
target, background, and plume intensities to compute learned estimates
of the probability density functions of intensities within each of the
three regions of the tracking window. A Bayesian classifier can then
use these estimates to extract potential target points. Additional
features such as texture, gradient, and linearity might also be
compared among sampled regions and used to devise an optimum extraction
algorithm. Alternatively, the sampled regions could be reduced in size
and increased in number to provide sampling windows over the entire
image whose distinctive features are compared statistically to assign
each window to a target, plume, or background region. Target extraction
and region growing algorithms which incorporate these and other
ideas are being developed for testing in the prototype RTV tracking
system as part of ongoing research efforts at WSMR, NMSU, and Purdue
University.
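As a rough illustration of the Bayesian idea, the following Python
sketch estimates an intensity density for each region from sampled
pixels and assigns a pixel to the region with the largest posterior.
The sample values, the prior probabilities, and the small additive
smoothing constant are all assumptions made for the example; the report
does not specify them.

```python
def estimate_pdf(samples, levels=256, smoothing=0.01):
    """Estimate P(intensity | region) from sampled pixels using a
    histogram with additive smoothing (the smoothing is an assumption)."""
    counts = [smoothing] * levels
    for v in samples:
        counts[v] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def classify_pixel(value, pdfs, priors):
    """Return the region whose prior * likelihood is largest."""
    return max(pdfs, key=lambda name: priors[name] * pdfs[name][value])

# Hypothetical intensity samples from the three regions of the window.
samples = {
    "target": [110, 112, 111, 110, 113],
    "plume": [200, 205, 210, 205, 200],
    "background": [140, 145, 150, 145, 140, 150, 145],
}
pdfs = {name: estimate_pdf(vals) for name, vals in samples.items()}
priors = {"target": 0.1, "plume": 0.1, "background": 0.8}

for value in (110, 145, 205):
    print(value, classify_pixel(value, pdfs, priors))
```

Because the densities are relearned from each new set of samples, the
decision boundaries follow changes in scene brightness without any
fixed threshold.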


PREPROCESSING  TO  ENHANCE  TARGET  FEATURES 

With the large variety of incoming images which are presented to an
optical tracker, it may be necessary to process the images before target
extraction is attempted, in order to enhance distinctive features of the
image which indicate the presence of targets. For example, man-made
objects generally have distinctive textures and edges which may be
enhanced and used in target extraction. The following preprocessing
operations are being investigated at WSMR and Purdue University: a
2-dimensional median filter, an averaging filter, a 2-dimensional
bandpass filter, a local extrema operator, and selected combinations of
these operations. In addition, a maximum entropy image restoration
technique (Frieden, 1977), developed at the University of Arizona's OSC,
has been applied to WSMR tracking imagery. Other preprocessing
operations to be investigated as part of the continuing research effort
at WSMR include a variety of gradient operators, a moment operator, and
a logarithmic input transformation.

All of the preprocessing algorithms currently being tested at WSMR are
formulated in terms of filter windows which are either convolved with
the digitized image or shifted and combined with the image pixels in a
nonlinear fashion. The FILTER program allows the user to operate on
any size input image by simply selecting the desired operation and
window size. Presently only rectangular windows may be selected, and they
must slide horizontally across the picture as the operation is performed.
Future research at Purdue University will investigate the use of a
window whose shape is modified when edge points are detected, thus
allowing nonrectangular windows to operate along target boundaries.

The median filter performs a nonlinear operation which eliminates high
frequency noise while preserving monotonic edges precisely. A simple
median filter replaces the intensity of the center pixel of each 3 x 3
window with the median intensity of all nine pixels within the window.
Shown in figure 4.3 (upper left corner) is the original digitized image
of an airplane turning away from the camera. The result obtained by
processing this image with a 3 x 3 median filter window is shown in the
upper right corner of figure 4.3. The edges are preserved, but the
noise is reduced and the data is more correlated. The two bottom
pictures of figure 4.3 illustrate the results obtained by applying the
simple histogram comparison technique described above to display
thresholded versions of the original (lower left) and median filtered
(lower right) images.
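A minimal Python sketch of the 3 x 3 median filter is given below to
make the edge-preserving behavior concrete. It is not the FILTER
program itself; in particular, copying the border pixels unchanged is an
assumption, since the report does not say how the image border is
treated, and the test image is hypothetical.

```python
from statistics import median

def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3 x 3 window.
    Border pixels are copied unchanged (an assumption)."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out

# Vertical step edge (10 -> 50) with a single noise spike (99).
noisy = [
    [10, 10, 10, 50, 50, 50],
    [10, 10, 10, 50, 50, 50],
    [10, 99, 10, 50, 50, 50],
    [10, 10, 10, 50, 50, 50],
    [10, 10, 10, 50, 50, 50],
]
cleaned = median_filter_3x3(noisy)
for row in cleaned:
    print(row)
```

In this example the spike is removed while the step between columns 2
and 3 stays exactly where it was, which is the property that makes the
median filter attractive ahead of thresholding.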

The averaging filter replaces the central pixel of each window with the
average intensity of all of the pixels within the window. The result
is a smoothed version of the original picture. It is most effectively
used in conjunction with the median filter. The median filter removes
most of the noise points, and then the averager smears the remaining
image points to produce an optimum image for thresholding. Figure 4.4
contains (from top to bottom) the original digitized image of a cruise
missile, the median filtered image, and the result of averaging the
median filtered image.
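For comparison with the median filter, an averaging filter can be
sketched as follows. The window size parameters (assumed odd), the
treatment of border pixels, and the test image are assumptions for
illustration only.

```python
def average_filter(image, height=3, width=3):
    """Replace each pixel whose full height x width window fits inside
    the image with the mean of that window; pixels too close to the
    border are copied unchanged. Window dimensions are assumed odd."""
    rows, cols = len(image), len(image[0])
    hr, hc = height // 2, width // 2
    out = [[float(v) for v in row] for row in image]
    for r in range(hr, rows - hr):
        for c in range(hc, cols - hc):
            total = sum(image[r + dr][c + dc]
                        for dr in range(-hr, hr + 1)
                        for dc in range(-hc, hc + 1))
            out[r][c] = total / (height * width)
    return out

# A clean vertical step edge, such as a median filter might deliver.
step = [[10, 10, 10, 50, 50, 50] for _ in range(5)]
smoothed = average_filter(step)
print([round(v, 2) for v in smoothed[2]])
```

Unlike the median filter, the averager spreads the step over the
neighboring columns, which is the "smearing" referred to above.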

It is well known that the human visual system responds to incoming
images with an approximate logarithmic transformation followed by a
spatial and temporal bandpass filtering operation. The usefulness of
such a spatial bandpass filter in extracting objects for tracking is
being investigated.

A suitable filter function which is to be convolved with an input image
is shown in table 4.1. The values are chosen to make computation of
the fractional values easy for real-time implementation. Because the
video data has twice as much resolution in the horizontal direction as
in the vertical (only one video field is processed at a time), the
bandpass filter is modified as shown in table 4.2. Both filters have a dc
response of 0.5 and a gain of 7.5 for a small object which just covers
the plus area.
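The stated dc response and gain can be checked directly for the
modified filter. The Python sketch below uses the coefficients as read
from table 4.2 (5 rows by 9 columns, with the first entry of the middle
row taken as -.25 by symmetry, which is an assumption about the printed
table); the function names and the flat test image are hypothetical.

```python
# Coefficients of the modified bandpass filter as read from table 4.2.
KERNEL = [
    [0, 0, 0, -.25, -.75, -.25, 0, 0, 0],
    [0, -.25, -.5, .25, 1.25, .25, -.5, -.25, 0],
    [-.25, -.5, .25, 1.0, 1.5, 1.0, .25, -.5, -.25],
    [0, -.25, -.5, .25, 1.25, .25, -.5, -.25, 0],
    [0, 0, 0, -.25, -.75, -.25, 0, 0, 0],
]

def dc_response(kernel):
    """Response to a uniform image of unit intensity: sum of all weights."""
    return sum(sum(row) for row in kernel)

def plus_area_gain(kernel):
    """Response to a unit object covering exactly the positive weights."""
    return sum(v for row in kernel for v in row if v > 0)

def filter_at(image, kernel, r, c):
    """Weighted sum of the kernel centered on pixel (r, c). The kernel is
    symmetric, so this equals the convolution at that point."""
    kr, kc = len(kernel) // 2, len(kernel[0]) // 2
    return sum(kernel[i][j] * image[r - kr + i][c - kc + j]
               for i in range(len(kernel))
               for j in range(len(kernel[0])))

flat = [[100] * 11 for _ in range(7)]   # uniform background of level 100
print(dc_response(KERNEL), plus_area_gain(KERNEL), filter_at(flat, KERNEL, 3, 5))
```

All coefficients are multiples of 1/4, so each product reduces to shifts
and adds, which is presumably what makes the fractional values easy to
compute in real time.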

The first two pictures in figure 4.5 present the results of convolving
a digitized missile image with the filter function of table 4.2. The
first picture is the original image. The missile becomes much more
visible in the second picture because of the edge enhancement produced
by the bandpass filter.

TABLE 4.1. 2-DIMENSIONAL BANDPASS FILTER FUNCTION

      0      0      0      0   -.25      0      0      0      0
      0      0      0   -.25    -.5   -.25      0      0      0
      0      0   -.25   -.25   +.25   -.25   -.25      0      0
      0   -.25   -.25    +.5   +1.0    +.5   -.25   -.25      0
   -.25    -.5   +.25   +1.0   +1.5   +1.0   +.25    -.5   -.25
      0   -.25   -.25    +.5   +1.0    +.5   -.25   -.25      0
      0      0   -.25   -.25   +.25   -.25   -.25      0      0
      0      0      0   -.25    -.5   -.25      0      0      0
      0      0      0      0   -.25      0      0      0      0

TABLE 4.2. MODIFIED BANDPASS FILTER TO ALLOW HALF RESOLUTION
IN VERTICAL DIRECTION

      0      0      0   -.25   -.75   -.25      0      0      0
      0   -.25    -.5   +.25  +1.25   +.25    -.5   -.25      0
   -.25    -.5   +.25   +1.0   +1.5   +1.0   +.25    -.5   -.25
      0   -.25    -.5   +.25  +1.25   +.25    -.5   -.25      0
      0      0      0   -.25   -.75   -.25      0      0      0

The third picture (far right) in figure 4.5 shows the results of passing
the digitized image through the bandpass filter and the local extrema
operator. In an n x m window, the extrema operator compares the gray level
of the center point with those of its two vertical neighbors. If it is
above both neighbors, the center point is a local maximum in the
vertical direction. If this is the case, the center point is compared with
each point along each vertical direction until a gray level is
encountered which is above the center point's value or until the edge of
the window is encountered. The largest differences between gray levels in
each vertical direction are then compared, and the smaller of the two
is retained as the size of the local maximum in the vertical direction.

An example is shown in table 4.3 for a 5 x 7 window. The center value
is 45. The 47 ends the search in the top direction, and the 50 ends the
search in the bottom direction. The range is 15 above and 12 below.
Therefore, the center point is a local maximum in the vertical direction
of size 12. If the point is a local minimum instead of a maximum, the
process is done in the same way, interchanging the above and below
comparison tests.

This process is also done in the horizontal direction. In the example
of table 4.3, the center point is not a local extremum in the horizontal
direction. If a point is a local extremum in both horizontal and
vertical directions, only the larger of the two is retained at that
location. The extrema detection process is equivalent to local maximum
and minimum determination following hysteresis smoothing of various
amounts.
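The extrema operator can be illustrated with a short Python sketch that
reproduces the worked example. The gray levels are those of table 4.3,
read here as 7 rows by 5 columns (an assumption about the table
layout); the function names and the use of a negative size to mark a
local minimum are conventions adopted only for this example.

```python
def directional_extremum(center, toward_a, toward_b):
    """Size of the local extremum of `center` along one axis.

    toward_a and toward_b list the gray levels met when walking away
    from the center in the two opposite directions, nearest neighbor
    first. Returns a positive size for a local maximum, a negative size
    for a local minimum (assumed sign convention), and 0 otherwise.
    """
    if center > toward_a[0] and center > toward_b[0]:
        sign = 1
    elif center < toward_a[0] and center < toward_b[0]:
        sign = -1
    else:
        return 0
    ranges = []
    for values in (toward_a, toward_b):
        largest = 0
        for v in values:
            diff = sign * (center - v)
            if diff < 0:        # gray level passes the center value: stop
                break
            largest = max(largest, diff)
        ranges.append(largest)
    return sign * min(ranges)

def extremum_sizes(window):
    """Vertical and horizontal extremum sizes at the window center."""
    rows, cols = len(window), len(window[0])
    r, c = rows // 2, cols // 2
    center = window[r][c]
    up = [window[i][c] for i in range(r - 1, -1, -1)]
    down = [window[i][c] for i in range(r + 1, rows)]
    left = [window[r][j] for j in range(c - 1, -1, -1)]
    right = [window[r][j] for j in range(c + 1, cols)]
    return (directional_extremum(center, up, down),
            directional_extremum(center, left, right))

WINDOW = [  # gray levels of table 4.3
    [36, 40, 47, 30, 24],
    [33, 34, 30, 32, 36],
    [36, 40, 32, 40, 30],
    [42, 46, 45, 43, 35],
    [36, 40, 33, 47, 32],
    [34, 42, 50, 42, 40],
    [30, 30, 20, 45, 36],
]
vertical, horizontal = extremum_sizes(WINDOW)
retained = max((vertical, horizontal), key=abs)
print(vertical, horizontal, retained)
```

The search stops at 47 above and 50 below the center, giving ranges of
15 and 12, and the center has a larger left-hand neighbor (46), so no
horizontal extremum is reported.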

In  figure  4.5  (right),  the  extrema  sizes  are  indicated  by  the  displayed 
gray  levels.  No  distinction  is  made  between  horizontal  and  vertical 
extrema.  The  edges  emphasized  by  the  bandpass  filter  are  marked  by  the 
extrema.  The  missile  orientation  can  be  extracted  from  its  edge  infor- 
mation. 

The  texture  of  various  regions  can  also  be  characterized  by  the  types 
and  number  of  extrema  present.  The  region  characterization  may  be  use- 
ful for  background  classification  and  for  plume  identification.  Para- 
meters for  this  measurement  are  extracted  by  counting  the  number  of 
extrema  of  various  sizes  within  a window  surrounding  each  point. 

TABLE 4.3. SAMPLE GRAY LEVELS FOR EXTREMA DETECTION

   36   40   47   30   24
   33   34   30   32   36
   36   40   32   40   30
   42   46   45   43   35
   36   40   33   47   32
   34   42   50   42   40
   30   30   20   45   36

The results of a maximum entropy restoration of the digitized image of
a cruise missile are presented in the left half of figure 4.6. The
bottom left picture is the original digitized image, the center picture is
a smoothed image obtained by convolving the original image with a
Gaussian spread function, and the top left picture is the restored image.
By comparing the restored image with a scale drawing of the cruise
missile (right half of figure 4.6), the remarkable accuracy of the
restoration is verified, particularly in the wedge shape of the nose and
the detail of the air inlet. The maximum entropy restoration cannot be
accomplished in real-time. However, using array processors it may be
possible to utilize this technique for near real-time applications which
require the image to be restored within a few hundred milliseconds.


CLASSIFICATION OF EXTRACTED OBJECTS

As  was  described  previously,  the  RTV  tracker  contains  a projection  proc- 
essor which  accumulates  projections  of  the  potential  targets  extracted 
by  the  video  processor,  and  a tracker  processor  which  utilizes  the  1/8 
percentile  points  of  the  projections  to  characterize  the  object  shape. 
Each  object  is  classified  as  a target  or  nontarget  based  on  preassigned 
shape  factors  and  1/8  percentile  points  obtained  from  previous  frames. 
This  type  of  shape  analysis  based  on  projection  data  is  producing  excel- 
lent results  on  digitized  video  data  in  the  RTV  simulation. 

An alternative method of shape analysis currently being investigated by
personnel at WSMR and Purdue University utilizes Fourier descriptors of
the contours of the extracted objects to generate complexity measures
which can be used to classify objects as targets and nontargets. A
detailed description of ongoing research in this area is presented in
a previous WSMR technical report.¹


¹ Fukunaga, K., A. L. Gilbert, M. K. Giles, O. R. Mitchell, R. D. Short,
and J. M. Taylor, "Segmentation and Structure Analysis for Real-Time
Video Target Tracking," WSMR Technical Report STEWS-ID-77-1, October 1977.

Figure 4.1. Digitized Video Image (a) before Target Extraction and
(b) after Target Extraction using a Thresholding Technique

Figure 4.2. Intensity Histograms of Background and Target Regions of Figure 4.1(a)


Figure 4.3. Preprocessing with a Median Filter
(Top left: original digitized image,
top right: median filtered image,
bottom left: original image after
thresholding, and bottom right:
median filtered image after thresholding)

Abstract

Advancements in technology over the past decade have opened doors for
accomplishing computational tasks that were not imagined possible at the
beginning of that period. Coupled with some recent concepts in pattern
recognition and artificial intelligence, optical tracking system
configurations with excellent tracking reliability and with the capability
to correct for boresight error in real-time are within the scope of
current technology. Aspects of the problem and the new supporting
technology are discussed in this paper to put these developments into
perspective.


Introduction 


Optical tracking has been a mainstay of accurate metric range
instrumentation since the first testing of modern rocketry during and
following the Second World War. The accuracies that were possible from
optical instruments exceeded those from other available instruments.
Improvements in encoders, optical testing, modelling of the atmosphere,
and optical design continuously improved the accuracies of optical
instruments. The major drawback was the required film processing, which
delayed the delivery of boresight corrected optical data.

Recent changes in technology not directly related to optics have
created the potential for relieving part of the delay problem in data
delivery. Automatic tracking methods using high-speed microprocessors,
artificial intelligence, and pattern recognition techniques, together with
special modifications to the existing optical systems, are now available
to perform most of the film reading function in an on-line, real-time mode.
These methods far exceed the conventional contrast, edge, and correlation
trackers in sophistication and capability, since they are based upon an
understanding of some definable properties of the image involving many
parameters as compared to only a few.


The  Intelligence  of  Object  Identification 

Pattern recognition is a mathematical science based upon the separation
of a parameter space into two or more regions, so that when the parameter
is measured it may be classified as belonging to one of the appropriate
regions. It follows that a vector parameter will give rise to a
parameter space of dimensionality equal to the number of independent
elements in the vector, implying that for an N-vector the required
separation is a hyperplane in N space (see Figure 1). If the parameter is
a single element vector, an assignment can be made on the basis of a
single threshold on the real numbers and a tracker can be built that uses
this decision rule. An example we call a contrast tracker uses a
threshold on brightness for the assignment. A preprocessing algorithm may
be placed before the decision. If we preprocess for magnitude of change
in intensity, the same thresholding rule will yield an edge tracker.
These are amongst the simplest applications of pattern recognition to the
object identification problem. The very simplicity of the method produces
the major drawbacks to the application. Since these algorithms are easily
confused, many spurious objects in the field of view (FOV) often meet the
classification criteria.


FIGURE  1 
DECISION  SPACE 


A somewhat different approach, which uses an array of points and measures
the closeness of fit to a subsequently measured similar array while
choosing the best match as the correct location, is generally known as a
correlation tracker. The decision is again based upon a single element
parameter vector (the closeness of fit), but the preprocessing of data is
much more elaborate. While this approach offers an improvement in
confidence that the correct object has been located if the object
description is known, it suffers from two principal problems. The first
is that generally the object being tracked changes appearance continually
while objects in the background may not. This requires an adaptive object
description which may slowly converge to the acceptance of an undesired
object as the desired one. The second problem is that this approach
requires a very large amount of processing to do a good job, since the
optimal linear process would be convolution of the NxM object description
array over the PxQ array of data points, and generally P >> N, Q >> M. A
commonly used simplification is an additive (subtractive) algorithm that
seeks the best fit of the desired array to the data, instead of the
convolution. This approach necessarily results in loss of tracker
performance. For these reasons, the correlation tracking method is
generally limited to very restricted window tracking and fairly slow
update rates.
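The additive (subtractive) simplification can be illustrated with a
sum-of-absolute-differences search, written here in Python. The paper
does not give the exact fit measure, so the use of absolute differences
is an assumption, as are the function names and the small test arrays.

```python
def sad(template, image, top, left):
    """Sum of absolute differences between the template and the image
    patch whose upper-left corner is at (top, left)."""
    return sum(abs(template[i][j] - image[top + i][left + j])
               for i in range(len(template))
               for j in range(len(template[0])))

def best_match(template, image):
    """Try every placement of the N x M template inside the P x Q image
    and return (score, top, left) for the closest fit."""
    n, m = len(template), len(template[0])
    p, q = len(image), len(image[0])
    return min((sad(template, image, t, l), t, l)
               for t in range(p - n + 1)
               for l in range(q - m + 1))

template = [[9, 1],
            [1, 9]]
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
score, top, left = best_match(template, frame)
print(score, top, left)
```

The search visits (P-N+1)(Q-M+1) placements with N x M subtractions
each, which shows why the method is usually confined to a small
tracking window.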

Approaches to real-time optical tracking have generally been limited
to these methods, for the following principal reasons. The first and
most important has been the magnitude of the real-time processing
requirement for the more elaborate approaches, which has exceeded the
computational resources generally available. The second has been a lack
of image understanding that would allow the formulation of more reliable,
yet simple, approaches. Substantial progress has recently been made in
the former, and there are many encouraging new developments in the latter.

A variety of methods of image data processing have become known over
the past decade. Much of the effort has been sponsored by the Defense
Advanced Research Projects Agency (DARPA) and by NASA to further
scientific understanding of imagery and image processing. These agencies
continue to sponsor image understanding research, and the various
military services through their research sponsoring offices as well as
the National Science Foundation have also become heavily involved in
pattern recognition and image understanding research.
Applications-oriented research at the US Army White Sands Missile Range
(WSMR) and at the US Army Night Vision Laboratories (NVL) has led
recently to systems of reasonably high sophistication using concepts
developed in-house and through sponsored research to solve complex
identification and tracking problems. WSMR has concentrated on objects in
the visible spectrum and in real-time, while NVL has been primarily
concerned with the infrared and in near real-time. Many other systems,
not necessarily real-time, have been developed for applications in
medicine, meteorology, and space research.

Many of the newer methods involve the use of many elements in the
parameter vector to glean more information from the data. In applying
pattern recognition methods to the object identification problem, the
engineer is trying to minimize the amount of data he must handle and
maximize his confidence that he made the correct decision. Any linear
process will preserve the quantity of data (260,000 points for a 512x512
image, possibly 8 bits per point), which is obviously not desirable if
much processing is required to make a decision. The engineer is forced
to require a high degree of parallel processing on linear processes, and
to perform nonlinear operations to reduce the data quantity prior to
determining the values of the parameter elements used in the decision
rule. Ideally, the dimensionality of the decision space should be kept
reasonably small to allow decisions to be made in real-time or in near
real-time.

Some of the preprocessing methods currently in use are:

Filtering

A general class of operations that involve the convolution of a point
spread function array with the image to achieve some desired objective
with the image. Examples include removing spatially invariant
degradations due to the optics or atmosphere, boosting the high frequency
content of the image to enhance edges, removing noise in the image, making
the image more pleasing to the eye, and other such operations. Generally
those operations which remove degradations are called estimation and
those which emphasize some spatial frequencies or some aspects of the
image are called enhancement. It must be noted that enhancement is an
intentionally introduced distortion to produce some desired effect.

Transforms

The class of operations that maps the image into a new domain where
the elements in the new domain are a measure of some property of the
original image. The most common example is the discrete Fourier
transform (DFT), especially in the fast algorithms (FFT). The DFT
identifies the spatial frequency content of the image, which will allow
further processing based upon these components. A class of binary Fourier
(BIFORE) transforms has been developed over the past decade which are
similar to the DFT but are more suited to computer applications. These
might be called lesser transforms since they do not represent the
information in the image as completely. Because they are much more
efficiently run on a computer than the DFT, they have important
applications in image transformations. Among the lesser transforms are
the now popular Hadamard transform based upon Walsh functions, and the
less known but simple Haar transform. To conceptually accommodate these
lesser transforms, the notion of sequency as a conceptual equivalent to
frequency was developed. Sequency is defined simply as half the number of
zero crossings per unit interval. These transforms compute the
concentration of image energy in these sequency components. This can be
useful for identifying features of interest in the image. It is
necessary, of course, to apply all of these transforms in a
2-dimensional algorithm to process the 2-dimensional images. The Hadamard
transform is particularly suited to computer operations on images.
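To make the notion of sequency concrete, the Python sketch below builds
an 8 x 8 Hadamard matrix and counts the sign changes (zero crossings) in
each row. The Sylvester construction used here is one common way to
generate the matrix and is an assumption, since the paper does not name
a construction; the function names and the test signal are also
hypothetical, and only the 1-dimensional case is shown.

```python
def hadamard(order):
    """Sylvester construction of the 2**order x 2**order Hadamard matrix."""
    h = [[1]]
    for _ in range(order):
        h = ([row + row for row in h] +
             [row + [-v for v in row] for row in h])
    return h

def sign_changes(row):
    """Number of zero crossings along one basis function."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def transform(signal, h):
    """Unnormalized Hadamard transform: one coefficient per row of h.
    Only additions and subtractions are needed."""
    return [sum(a * b for a, b in zip(row, signal)) for row in h]

H8 = hadamard(3)
changes = [sign_changes(row) for row in H8]
print(changes)

# A signal with a single sign change puts all of its energy in the one
# coefficient whose basis function has the same shape.
signal = [1, 1, 1, 1, -1, -1, -1, -1]
print(transform(signal, H8))
```

Each row has a different number of sign changes, so the rows can be
ordered by sequency in the same way that Fourier components are ordered
by frequency.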


Point  Processing 


Individual points in the image are assigned new values based upon some
assignment rule. This may take a variety of forms with a large variation
in apparent results. One point processing algorithm averages the
corresponding point of several frames or sequential images to produce a
weighted composite and remove transient degradations. Another assigns all
values above a given threshold to 1 and all values below to 0. This is
known as thresholding. A variation on thresholding is to assign
predetermined gray levels to 1 even though these may not be in a
continuous range. Still another algorithm, known as contrast stretching,
assigns all values below some intensity I0 to 0; all values above another
intensity I1 to the maximum gray level, say 256; and stretches the
intermediate values to occupy the full range (see Figure 2). Generally,
point processing methods are nonlinear, yielding fewer bits in the output
than in the data array.
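Two of the point processing rules described above, thresholding and
contrast stretching, are sketched below in Python. The function names
are hypothetical, the linear form of the stretch is an assumption based
on the description, and the default maximum level of 255 assumes 8-bit
data. The example limits of 100 and 164 are illustrative values only.

```python
def threshold(value, level):
    """Assign 1 to values above the threshold and 0 otherwise."""
    return 1 if value > level else 0

def contrast_stretch(value, low, high, max_level=255):
    """Map values at or below `low` to 0, values at or above `high` to
    max_level, and stretch the values in between linearly (integer
    arithmetic, rounding down) over the full output range."""
    if value <= low:
        return 0
    if value >= high:
        return max_level
    return (value - low) * max_level // (high - low)

for v in (90, 116, 132, 200):
    print(v, threshold(v, 128), contrast_stretch(v, 100, 164))
```

Thresholding reduces each point to a single bit, while the stretch
keeps the full word length but discards the detail outside the chosen
limits.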


FIGURE  2 

CONTRAST  STRETCHING 


The next step in the process is to identify the values of the elements
in the parameter vector. These elements may be such things as size,
orientation, number of corners, brightness, etc. When joined in a single
parameter vector, they describe all we think we need to know to adequately
describe the object for purposes of identification.

An example of this process of evaluating the elements of a parameter
vector arises from the WSMR/New Mexico State University (NMSU) effort on
the Real-Time Videotheodolite. The technique is credited to Dr. Gerald
Flachs and Yee Hsun U of NMSU.


First, the video is point processed with an adaptive nonlinear
algorithm that assigns each point either a 1 or a 0 depending upon whether,
according to a Bayesian decision rule, it is classified as a potential
target or probable background point. This yields one or more connected
regions of 1's which then must be classified. A projection technique is
applied for each such connected region where the number of 1's counted
both horizontally and vertically are projected on the vertical and
horizontal axes, respectively (see Figure 3). The projections are each
segmented into eight equal area segments, and the segment lengths
normalized by the projection length. The result is a parameter vector of
16 elements, eight horizontal projection segment lengths and eight
vertical projection segment lengths. If the parameter vector is close to
the stored ideal vector in 16-space, a decision is made that the object
being tested is the correct object for tracking. The actual WSMR/NMSU
algorithm takes this one step further and separates the vertical
projections onto the horizontal axis into a top and a bottom half to
determine an orientation angle as well.

FIGURE 3

PROJECTIONS AND PERCENTILE POINTS
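The projection step can be sketched in Python as follows. This is an
illustration under stated assumptions rather than the WSMR/NMSU
algorithm itself: each projection bin is treated as a unit-width bar and
the equal-area points are found by linear interpolation within a bin,
and the function names and the rectangular test region are
hypothetical. The top and bottom split used for the orientation angle is
not shown.

```python
def projections(mask):
    """Count the 1's in each column and in each row of a binary region."""
    by_column = [sum(col) for col in zip(*mask)]   # onto the horizontal axis
    by_row = [sum(row) for row in mask]            # onto the vertical axis
    return by_column, by_row

def percentile_points(projection, parts=8):
    """Positions along the axis where the cumulative area reaches k/parts
    of the total, for k = 0..parts (interpolated inside each bin)."""
    total = sum(projection)
    points, cumulative, k = [], 0.0, 0
    for i, count in enumerate(projection):
        while (k <= parts and count > 0
               and cumulative + count >= k * total / parts):
            points.append(i + (k * total / parts - cumulative) / count)
            k += 1
        cumulative += count
    return points

def shape_vector(mask, parts=8):
    """Segment lengths between successive percentile points of both
    projections, each normalized by the projection length."""
    vector = []
    for projection in projections(mask):
        pts = percentile_points(projection, parts)
        length = pts[-1] - pts[0]
        vector += [(b - a) / length for a, b in zip(pts, pts[1:])]
    return vector

# Hypothetical region: a 4 x 8 rectangle of 1's inside a 6 x 10 window.
mask = [[1 if 1 <= r <= 4 and 1 <= c <= 8 else 0 for c in range(10)]
        for r in range(6)]
vector = shape_vector(mask)
print(len(vector), vector[:8])
```

For a uniform rectangle every segment has the same normalized length;
other shapes give unequal segments, and the resulting 16-element vector
is what would be compared with the stored ideal vector.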

It can be seen that by using this method, the amount of data is
rapidly reduced, first during point processing from 8 bits per picture
element to 1 bit per picture element, and subsequently after the
projections to only a few bits per image. This data compression is
essential to enable currently available microprocessors to keep pace with
the standard video rates of 60 fields per second, a goal accomplished by
this system.

Finally, it should be noted that a human incorporates many elements
into the parameter vector that he uses to identify an object. The
difficulty of understanding the human visual process has caused rather
slow progress in teaching computers to "see." We know that the human uses
such things as texture, orientation, color, size, shading, shape, context,
etc. to identify objects. Parameter vectors which incorporate these
elements are difficult to compute, especially in real-time. Additionally,
certain of these characteristics are difficult to quantify. It is not
necessary, however, to require the computer to see the same things a
human does. It is difficult to visualize elements of parameter vectors
that do not have a physical meaning to a human, but which may be useful
for computer recognition processes. Much work remains to be done to
produce a highly sophisticated sight process in a computer.


Sensors  and  Associated  Problems 


The intelligence of object identification is only one of the obstacles
to be overcome by the engineer. True, it has been one of the hardest
problems to solve due to lack of understanding of image properties. Another
troublesome problem, however, is the selection of a transducer on which to
image the scene containing the object to be identified.

Generally there are three problems that must be overcome. These are
resolution, timing, and sensitivity, each of which is discussed in turn.
It is essential that these problems be isolated and solved to yield a
system that will operate satisfactorily.

Resolution is an old problem well known to the optical engineer. It
is simple to establish that resolution will not exceed the diffraction
limit of the optical system. Generally, however, video transducers will
not have nearly the resolution capability of the optical system to which
they are attached. A good optical system may have 300 line pairs per
millimeter resolution capability, with a vidicon attached that is limited to
about 40 line pairs per millimeter. This may not be sufficient for the
required accuracy. A straightforward solution would be to interpose a
magnification of five times so that the image plane resolution is 40 line
pairs per millimeter with the transducer the same. This solution,
however, forces much tighter control over the servo drives since the FOV is
drastically reduced, and complicates the acquisition problem. A better
solution is to use a zoom lens that may be driven to a wide FOV for high
dynamic situations and acquisition, and a narrow FOV for low dynamic
tracking. Design of the zoom lens will be complicated by focusing
problems, run-out, required zoom transducers to adjust the magnitudes of the
azimuth and elevation boresight corrections, and shuttering. Additionally,
the zoom lens should not severely degrade the optical quality of the
instrument to which it is attached.

It  is  noteworthy  that  high  resolution  vidicons  are  available,  but  the 
data  bandwidth  is  generally  too  large  to  allow  real-time  processing.  If 
a sensor  produces  more  than  about  500  measurable  image  points  in  63.4 
microseconds  (standard  TV)  no  present  sequentially  operated  microproc- 
essor can  keep  up.  The  high  resolution  vidicons,  then,  must  be  reserved 
for  lower  frame  rates  or  larger  computer  processing  arrangements. 

Finally, video transducers are often subject to burns due to direct
exposure to the sun. In these cases, it is necessary to provide a sun
shutter or some means of protecting the device from burns.

Timing problems arise in a rather oblique fashion. It is no problem
to synchronize video with IRIG A or B, apparently solving the timing
problem. This is not true, however. Video sensors are generally scanned
devices. The time elapsed between the scanning of a single corresponding
point one frame apart is well-established at 1/30 second for monochrome
TV. For noncorresponding points, say (x_0, y_0) in frame 1 and (x_1, y_1)
in frame 2, the time separation is not exactly one frame. Further,
interlacing complicates the problem even more, since adjacent lines are
separated in time by approximately 1/60 second.
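
The scan timing can be made concrete with a rough model. The sketch below is an assumption-laden illustration, not part of the original system: it numbers lines from the top of the interlaced frame, places even lines in the first field, ignores blanking intervals, and uses the monochrome line rate of 525 lines x 30 frames per second.

```python
FIELD_PERIOD = 1.0 / 60.0      # seconds per field (two fields per frame)
LINE_PERIOD = 1.0 / 15750.0    # seconds per scan line (525 lines x 30 frames)

def sample_time(line, fraction_across=0.0):
    """Approximate scan time in seconds from the start of the frame.

    line            -- line number within the interlaced frame (0 = top)
    fraction_across -- horizontal position along the line, 0.0 to 1.0
    Blanking intervals are ignored (simplifying assumption).
    """
    field = line % 2             # even lines in the first field (assumed)
    line_in_field = line // 2    # position of the line within its field
    return (field * FIELD_PERIOD
            + line_in_field * LINE_PERIOD
            + fraction_across * LINE_PERIOD)

# Spatially adjacent lines belong to different fields.
adjacent_lines = sample_time(1) - sample_time(0)
# Consecutive lines of the same field are one line period apart.
same_field_lines = sample_time(2) - sample_time(0)
```

Under these assumptions adjacent_lines comes out at 1/60 second, while same_field_lines is a single line period, which is the nonuniform sampling the text describes.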

For  targets  of  high  dynamics  internal  to  the  FOV,  it  is  necessary  to 
treat  each  field  as  a separate  data  sample  array,  and  process  it  independ- 
ently. This  yields  60  samples  per  second  which  must  be  processed.  The 
results  may  be  used  at  that  rate  or  averaged  down  to  the  more  standard  20 
samples  per  second  range  data  rates. 

The final difficulty is that even though adjacent lines are separated
by 1/60 second, the object being tracked moves about in the FOV, and it is
nonuniformly sampled in time. Of particular concern is variation in
elevation since the scan process takes 1/60 second from top to bottom. The
algorithms for estimating position at any given time, given a nonuniformly
sampled data output, are rather complex, so another solution is desired.
The most straightforward is to gate the image onto the image plane, using
a short enough gate to freeze motion. The gate may be spaced uniformly
in time, thereby eliminating the nonuniform sampling problem. The image
stored on the vidicon then may be scanned in the conventional manner, but
the whole image may be treated as having occurred at some known time.

Finally, the question of transducer sensitivity must be considered.
If it were not for the problems of resolution and timing, the sensitivity
question would not arise. Generally, black and white vidicons are more
sensitive than film (higher equivalent ASA). Addition of a 5X zoom will
reduce available light by about five stops, although the loss will vary
throughout the zoom range. Additionally, if the vidicon is gated with, say,
50-microsecond windows at 60 gates per second, another 8 1/2 stops of
intensity reduction is introduced. At this point, the transducer sensitivity
will be inadequate for the T number of the system. It will generally be
necessary to solve this problem through either the optical design (probably
the expensive solution) or electronically. A reasonable approach is the use
of an image intensifier that can incorporate the gating electronically,
eliminating the need for mechanical shuttering and iris control.
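
The stop figures quoted above can be checked with a short calculation. One stop is a factor of two in light, so the loss in stops is the base-2 logarithm of the reduction factor. The gating figure follows directly from the 50-microsecond, 60-per-second example in the text. The zoom figure assumes that light at the image plane falls with the square of the magnification, which is an assumption made here and not stated in the text.

```python
import math

def stops_lost(light_fraction):
    """Stops of light lost when only light_fraction of the light remains."""
    if not 0.0 < light_fraction <= 1.0:
        raise ValueError("light_fraction must be in (0, 1]")
    return -math.log2(light_fraction)

# Gating: 50-microsecond windows, 60 gates per second (from the text).
gate_seconds = 50e-6
gates_per_second = 60
duty_cycle = gate_seconds * gates_per_second      # fraction of time exposed
gating_stops = stops_lost(duty_cycle)

# Zoom: assumed inverse-square dependence on a 5X magnification.
zoom_stops = stops_lost(1.0 / 5 ** 2)
```

The gating loss works out to roughly 8.4 stops, in line with the 8 1/2 stops quoted, and the assumed inverse-square model gives about 4.6 stops for the 5X zoom, consistent with "about five stops."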


Data  Handling  and  Storage 

It takes but a few moments of calculation to show that if each line
is segmented into 512 data points, and there are 480 visible lines in a
frame, all occurring in 1/30 second, then 7.4 million data points must be
handled each second for full-frame processing (restricted window
processing increases tracking difficulties in acquisition and tracking under
high-dynamic conditions, and reduces resolution by competing with the zoom
system for restricted FOV). Taking the data points at the rate they occur
in the scanned line, each point must be completely processed in about 100
nanoseconds. For even the simplest point processing, these data rates
represent a challenge. At present only a limited number of microprocessors
are capable of approaching these speeds. These include the TI74S481 bit
slice and the ADK9 family. The Real-Time Videotheodolite developed by
WSMR/NMSU uses the TI74S481, which has been found to perform satisfactorily
at a 200-nanosecond cycle time.
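
The "few moments of calculation" can be written out as follows. The point, line, and frame counts come from the text. The 52-microsecond active line time is an assumed, typical value for standard TV and is used only to illustrate where the figure of about 100 nanoseconds per point comes from.

```python
POINTS_PER_LINE = 512
VISIBLE_LINES = 480
FRAMES_PER_SECOND = 30
ACTIVE_LINE_SECONDS = 52e-6   # assumed visible portion of one scan line

# Total points to handle each second for full-frame processing.
points_per_second = POINTS_PER_LINE * VISIBLE_LINES * FRAMES_PER_SECOND

# Time budget per point if the work were spread evenly over the second.
average_ns_per_point = 1e9 / points_per_second

# Time budget per point at the rate points actually arrive within a line.
in_line_ns_per_point = ACTIVE_LINE_SECONDS / POINTS_PER_LINE * 1e9
```

This gives 7,372,800 points per second (the 7.4 million of the text), an average budget near 136 nanoseconds, and an in-line budget near 100 nanoseconds under the assumed active line time.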

By using a nonlinear point process algorithm to compress the data and
hard-wiring the algorithm in some fashion, the amount of data that results
is within the capability of the microprocessor. Several special design
techniques are necessary to handle and transfer the data from one processor
to the next so that the whole tracking problem can be solved in real-time.
The WSMR/NMSU effort uses multiple high-speed processors, each dedicated
to a single task, to accomplish this objective.


Another problem is data storage. Since an event may occur (unexpected
breaking-up of a missile, etc.) which may make the actual image very
important in addition to the information normally being extracted, it is
useful to store the video on a video tape recorder (VTR). To satisfactorily
reconstruct  the  tracking  problem  such  information  as  timing  markers,  shaft 
angle  encoder  information,  equipment  status,  zoom  and  rotation  drive  set- 
tings, etc.  is  needed  in  addition  to  the  video.  This  information  is  best 
stored  together  with  the  video  on  the  VTR.  WSMR  has  solved  this  problem 
by  developing  an  interfacing  data  inserter  that  inserts  this  information 
into  several  lines  of  video  available  in  the  vertical  retrace  interval. 
This  information  is  then  stored  with  the  video  in  a fashion  that  is  readi- 
ly recoverable  if  needed.  A whole  mission  as  viewed  by  the  instrument  may 
be  reconstructed  from  one  such  video  tape. 


Conclusions 


New  methods  to  increase  the  utility  of  optical  systems  are  being 
developed  at  a very  fast  rate.  Other  novel  developments  in  technology 
that bear careful watching are optical computers, Bragg cell technology
using bulk acoustic wave devices, and on-chip charge-coupled device
processing of visual information. A system that today pushes the
state-of-the-art will be made obsolete by tomorrow's technology. There is probably
more  future  in  optical  systems  than  in  the  purely  electronic  tracking  sys- 
tems of  typical  range  instrumentation.  Ideas  such  as  those  discussed  in 
this  paper  are  but  stepping  stones  in  the  direction  of  truly  intelligent, 
automatic  optical  tracking  systems.  The  optical  engineer  has  a challenge 
before him of keeping up with a rapidly changing field and using new and
powerful  technology  in  the  solution  of  optical  tracking  problems. 

Advancements in technology over the past decade have
opened doors for accomplishing computational tasks that were not

INTRODUCTION: Optical tracking has been a mainstay of accurate
metric range instrumentation since the first testing of modern
rocketry.  The  accuracies  that  were  possible  from  optical  instruments 
exceeded those from other available instruments. Improvements in
encoders, optical testing, modelling of the atmosphere, and optical design
continuously improved the accuracies of optical instruments. The major
drawback is the required film processing, which delays the delivery of
boresight-corrected optical data.

Recent  changes  in  technology  have  created  the  potential  for 
relieving  part  of  the  delay  in  data  delivery.  Automatic  tracking 
methods  using  high-speed  microprocessors,  artificial  intelligence,  and 
pattern  recognition  techniques,  together  with  special  modifications  to 
the  existing  optical  systems,  are  now  available  to  perform  most  of  the 
film  reading  function  in  an  on-line,  real-time  mode.  These  methods  far 
exceed  the  conventional  contrast,  edge,  and  correlation  trackers  in 
sophistication  and  capability,  since  they  are  based  upon  an  understand- 
ing of  some  definable  properties  of  the  image  involving  many  parameters 
as  compared  to  only  a few. 

THE INTELLIGENCE OF OBJECT IDENTIFICATION: Pattern recognition
is a mathematical science based upon the separation of a parameter
space into two or more regions, so that when the parameter is measured
it may be classified as belonging to one of the appropriate regions.
It follows that a vector parameter will give rise to a parameter space
of dimensionality equal to the number of independent elements in the
vector. Thus, for an N-vector, the required separation is a hyperplane
in N space. If the parameter is a single element vector, an assignment
can be made on the basis of a single threshold on the real numbers, and
a tracker can be built that uses this decision rule. An example we call
a contrast tracker uses a threshold on brightness for the assignment.
A preprocessing algorithm may be placed before the decision. If we
preprocess for the magnitude of change in intensity, the same thresholding
rule will yield an edge tracker. These are amongst the simplest
applications of pattern recognition to the object identification problem.
These algorithms are easily confused, since many spurious objects in the
field of view (FOV) often meet the classification criteria.
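
Both single-threshold trackers can be illustrated on one scan line of intensities. This is a minimal sketch: the function names, the sample scan line, and the threshold value are invented for illustration, and the "magnitude of change" is taken here as the absolute difference between neighboring points, which is one simple choice among several.

```python
def contrast_detect(values, threshold):
    """Assign 1 to every value above the threshold and 0 otherwise."""
    return [1 if v > threshold else 0 for v in values]

def edge_detect(line, threshold):
    """Preprocess for magnitude of change, then apply the same rule."""
    changes = [abs(b - a) for a, b in zip(line, line[1:])]
    return contrast_detect(changes, threshold)

# Hypothetical scan line: a bright object on a dark background.
scan_line = [10, 12, 11, 200, 210, 205, 12, 10]

bright_points = contrast_detect(scan_line, threshold=100)
edge_points = edge_detect(scan_line, threshold=100)
```

The contrast rule marks the three bright points, while the edge rule marks the two transitions into and out of the object. Any other bright patch or sharp transition in the FOV would be marked in exactly the same way, which is the weakness noted above.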

A somewhat different approach that uses an array of points and
measures the closeness of fit to a subsequently measured similar array,
while choosing the best match as the correct location, is generally
known as a correlation tracker. The decision is again based upon a
single element parameter vector (the closeness of fit), but the
preprocessing of data is much more elaborate. While this approach offers an
improvement in confidence that the correct object has been located if
the object description is known, it suffers from two principal problems.
The first is that generally the object being tracked changes appearance
continually while objects in the background may not. This requires an
adaptive object description which may slowly converge to the acceptance
of an undesired object as the desired one. The second problem is that
this approach requires a very large amount of processing to do a good
job, since the optimal linear process would be a convolution of the NxM
object description array over the PxQ array of data points, and
generally P >> N, Q >> M. A commonly used simplification is an additive
(subtractive) algorithm that seeks the best fit of the desired array to the
data, instead of the convolution. This approach necessarily results in
loss of tracker performance. For these reasons, the correlation
tracking method is generally limited to very restricted window tracking and
fairly slow update rates.
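
The subtractive simplification can be sketched as a sum-of-absolute-differences search. This is one common reading of an "additive (subtractive)" fit and is offered as an assumption, not as the algorithm of any particular tracker; the function name and the small arrays are invented for illustration.

```python
def best_subtractive_fit(image, template):
    """Slide an NxM template over a PxQ image.

    Returns (score, row, col) for the offset with the smallest sum of
    absolute differences; a score of 0 is a perfect match.
    """
    n, m = len(template), len(template[0])
    p, q = len(image), len(image[0])
    best = None
    for row in range(p - n + 1):
        for col in range(q - m + 1):
            score = sum(abs(image[row + i][col + j] - template[i][j])
                        for i in range(n) for j in range(m))
            if best is None or score < best[0]:
                best = (score, row, col)
    return best

# Hypothetical 4x5 image containing a 2x2 object description.
image = [[0, 0, 0, 0, 0],
         [0, 0, 9, 8, 0],
         [0, 0, 7, 9, 0],
         [0, 0, 0, 0, 0]]
template = [[9, 8],
            [7, 9]]

match = best_subtractive_fit(image, template)
```

Even this simplified search visits (P-N+1)(Q-M+1) offsets with N x M operations at each, which shows why the method is confined to restricted windows and slow update rates.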

Real-time optical tracking has generally been
limited to these approaches for the following principal reasons. The
first and most important has been the magnitude of the real-time
processing requirement for the more elaborate approaches, which has
exceeded the computational resources generally available. The second has
been a lack of image understanding that would allow the formulation of more
reliable, yet simple, approaches. Substantial progress has recently
been made in the former, and there are many encouraging new developments
in the latter.

A variety of methods of image data processing have become
known over the past decade. Applications-oriented research at the US
Army White Sands Missile Range (WSMR) has led recently to a system of
reasonably high sophistication using concepts developed in-house and
through sponsored research to solve complex identification and
tracking problems. WSMR has concentrated on objects in the visible spectrum
and in real-time. Many other systems, not necessarily real-time, have
been developed for applications in medicine, meteorology, and space
research.

Many of the newer methods involve the use of many elements in
the parameter vector to glean more information from the data. In
applying pattern recognition methods to the object identification problem,
the engineer is trying to minimize the amount of data he must handle and
maximize his confidence that he made the correct decision. Any linear
process will preserve the quantity of data (260,000 points for a 512x512
image, possibly 8 bits per point) which is obviously not desirable if
much processing is required to make a decision. The engineer is forced
to require a high degree of parallel processing on linear processes, and
to perform nonlinear operations to reduce the data quantity prior to
determining the values of the parameter elements used in the decision
rule. Ideally, the dimensionality of the decision space should be kept
reasonably small to allow decisions to be made in real-time or in near
real-time.


Some of the preprocessing methods currently in use are:

Filtering: Filtering operations generally involve the
convolution of a point spread function array with the image to achieve some
desired objective with the image. Examples include removing spatially
invariant degradations due to the optics of the atmosphere, boosting the
high frequency content of the image to enhance edges, removing noise in
the image, making the image more pleasing to the eye, and other such
operations. Generally those operations which remove degradations are
called estimation and those which emphasize certain spatial frequencies
or certain aspects of the image are called enhancement. It must be
noted that enhancement is an intentionally introduced distortion to
produce some desired effect.

Transforms: Operations that map the image into a new domain
are called transforms. The elements in the new domain are a measure of
some property of the original image. The most common example is the
discrete Fourier transform (DFT), especially in the fast algorithms
(FFT). The DFT identifies the spatial frequency content of the image,
which allows further processing based upon these components. A class
of binary Fourier (BIFORE) transforms has been developed over the past
decade which are similar to the DFT but are more suited to computer
applications. These might be called lesser transforms since they do
not represent the information in the image as completely. Because they
are much more efficiently run on a computer than the DFT, they have
important applications in image transformations. Among these lesser
transforms are the now popular Hadamard transform based upon Walsh
functions, and the less known but simple Haar transform. These transforms
can be useful for identifying features of interest in the image. It is
necessary, of course, to apply all of these transforms in a
two-dimensional algorithm to process the two-dimensional images.
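
As an illustration of why such transforms suit a computer, the sketch below gives a one-dimensional fast Walsh-Hadamard transform in natural (Hadamard) ordering. It uses only additions and subtractions. The function name and the sample data are illustrative assumptions, and the routine is a generic textbook form rather than anything specific to the system described here.

```python
def fast_walsh_hadamard(values):
    """Unnormalized Walsh-Hadamard transform of a power-of-two sequence."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        # Butterfly stage: combine pairs of elements 'half' apart.
        for start in range(0, n, 2 * half):
            for k in range(start, start + half):
                a, b = data[k], data[k + half]
                data[k], data[k + half] = a + b, a - b
        half *= 2
    return data

samples = [1, 0, 1, 0, 0, 1, 1, 0]
coefficients = fast_walsh_hadamard(samples)
# Applying the transform twice returns the input scaled by its length.
round_trip = fast_walsh_hadamard(coefficients)
```

A two-dimensional image would be handled by transforming every row and then every column of the result, in keeping with the two-dimensional requirement noted above.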

Point Processing: In point processing, individual points in
the image are assigned new values based upon some assignment rule. This
may take a variety of forms with a large variation in apparent results.
One point processing algorithm averages the corresponding point of
several frames or sequential images to produce a weighted composite and
remove transient degradations. Another assigns all values above a given
threshold to 1 and all values below to 0. This is known as
thresholding. A variation on thresholding is to assign predetermined gray levels
to 1 even though these may not be in a continuous range. Still another
algorithm, known as contrast stretching, assigns all values below some
intensity I_0 to 0; all values above another intensity I_1 to the maximum
gray level, say 255; and stretches the intermediate values to occupy the
full range. Generally, point processing methods are nonlinear, yielding
fewer bits in the output than in the data array.
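
Two of these assignment rules are short enough to sketch directly. The names i0 and i1 stand for the two intensities of the contrast-stretching rule. The default max_level of 255 assumes 8-bit data, and the linear stretch with integer division is an illustrative choice for how the intermediate values are spread.

```python
def threshold(pixels, level):
    """Assign 1 to values above level and 0 to all others."""
    return [1 if p > level else 0 for p in pixels]

def contrast_stretch(pixels, i0, i1, max_level=255):
    """Map values at or below i0 to 0, at or above i1 to max_level,
    and spread the values in between linearly over the full range."""
    if i1 <= i0:
        raise ValueError("i1 must be greater than i0")
    stretched = []
    for p in pixels:
        if p <= i0:
            stretched.append(0)
        elif p >= i1:
            stretched.append(max_level)
        else:
            stretched.append((p - i0) * max_level // (i1 - i0))
    return stretched

pixels = [10, 50, 100, 150, 200]
binary = threshold(pixels, level=100)
stretched = contrast_stretch(pixels, i0=50, i1=150)
```

Thresholding reduces each 8-bit point to a single bit, which is the kind of data reduction the text relies on for real-time operation.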

The next step in the process is to identify the values of the
elements in the parameter vector. These elements may include such
things as size, orientation, number of corners, brightness, etc. When
joined in a single parameter vector, they describe all we think we need
to know to adequately describe the object for purposes of identification.

A REAL-TIME TRACKING SYSTEM: By using the above concepts
together with high-speed microprocessors and special optics, a real-time
tracking  system  may  be  devised  that  demonstrates  a substantial  advantage 
over  the  contrast,  edge,  and  correlation  trackers  currently  on  the  mar- 
ket. The  greatest  challenge  is  that  of  doing  "intelligent"  processing 
of  video  data  at  the  extremely  high  data  rates  of  standard  TV. 

The development of an intelligent real-time video (RTV)
tracking system has been accomplished through the cooperative efforts of
research and development personnel at WSMR, New Mexico State University
(NMSU), and the Optical Sciences Center of the University of Arizona.
The prototype RTV processor is being assembled at NMSU, the automatic
zoom lens and image rotator at the University of Arizona, and the system
interfaces at WSMR. The system components will be integrated and the
system deployed early in fiscal year 1979 as an add-on modification to
the Contraves Model F cinetheodolite at WSMR.

Figure 1 is a block diagram of the RTV tracking system which
shows the RTV processor as the central element. The RTV processor
receives standard composite video from a television camera, locates the
target image, and provides control signals which drive the zoom and
image rotation elements and point the Contraves tracking optics at the
target. It also provides boresight correction signals and target
attitude angles which are recorded into the vertical retrace period of the
video tape used to record the tracking sequence.

FIGURE 1. RTV TRACKING SYSTEM

The  RTV  processor  consists  of  a distributive  array  of  five 
processors,  shown  in  Figure  1.  The  video  processor  synchronizes  and 
digitizes the video signal from the TV camera, performs a statistical
analysis  of  the  digitized  image,  and  separates  the  target  images  from 
the  background.  The  projection  processor  accumulates  binary  projections 
of  the  target  and  plume  images  and  establishes  the  structural  parameters 
which  locate  and  describe  the  shape  of  the  target  and  plume  images.  The 
tracker  processor  establishes  a structural  confidence  in  the  data  and 
implements  an  intelligent  tracking  strategy.  The  control  processor 
utilizes  the  structural  confidence  to  combine  current  target  coordinates 
with  previous  target  coordinates  to  orient  the  optics  toward  the  next 
expected  target  position,  forming  a fully  automatic  system.  The  input/ 
output  (I/O)  processor  provides  a user  interface  to  the  tracking  proc- 
essors and  is  responsible  for  recording  the  tracking  data  with  a video 
tape  recorder. 

A Research  Oriented  Processor  Configuration:  Four  of  the  five 
distributive  processors  (excluding  the  I/O  processor)  which  comprise  the 
RTV  processor  are  high-speed  microprogrammable  processors,  each  of  which 
requires  a stored  microprogram  to  control  its  designated  tracking  func- 
tion. To  provide  a powerful  tool  for  future  research  in  video  tracking 
algorithms  and  to  facilitate  operational  testing  of  the  RTV  system,  the 
control  store  of  each  processor  is  realized  with  a read/write  random 
access  memory. 

These  four  distributive  processors  are  being  built  with  a 
standard microprogrammable processor architecture to simplify the
development  and  maintenance  of  the  RTV  tracking  system.  This  standard 
architecture  has  been  designed,  built,  and  tested  at  NMSU.  Based  on  the 
new  Texas  Instruments  (TI)  74S481  Schottky  processor  chip,  it  provides 
a microinstruction  cycle  time  of  under  200  nanoseconds  with  sufficient 
computational  power  to  implement  the  required  RTV  tracking  algorithms. 
The standard architecture requires several LSI chips which may be
partitioned into control and processing sections. Overlapping the execution
of one microinstruction with the fetch of the next one allows the
processor to achieve a minimum microinstruction cycle time equal to the
larger of either the fetch time or the execution time, significantly
increasing the speed of the processor.

The four high-speed processors included in the RTV tracking
loop are described in some detail in the following paragraphs. In each
case,  the  processor  is  built  around  the  standard  architecture  outlined 
above.  Some  specialized  hardware  is  added  to  the  standard  configuration 
in  each  case  to  accommodate  the  specific  functions  of  the  individual 
processors. 

The  Video  Processor:  The  video  processor  decomposes  each 
video  field  into  target,  plume,  and  background  pixels  at  the  standard 
video  rate  of  60  fields  per  second.  As  the  TV  camera  scans  the  scene, 
the  video  intensity  is  digitized  at  m equally  spaced  points  across  each 
horizontal  scan  line.  A resolution  of  m = 512  pixels  per  line  results 
in a pixel period of 96 nanoseconds. Within 96 nanoseconds, a
pixel  intensity  is  digitized  and  quantized  into  8 bits  (256  gray  levels), 
counted  into  one  of  six  256-level  histogram  memories,  and  then  converted 
by  a decision  memory  to  a 2-bit  code  indicating  its  classification  (tar- 
get, plume,  or  background).  The  2-bit  classification  code  is  passed  to 
the  projection  processor  via  the  target  data  (TD)  and  projection  data 
(PD)  lines.  TD  is  high  for  target  points;  PD  is  high  for  plume  points. 

The  basic  assumption  of  the  image  decomposition  method  is  that 
the  target  image  has  some  video  intensities  not  contained  in  the  immedi- 
ate background.  A tracking  window  is  placed  about  the  target  image,  as 
shown in Figure 2, to sample the background intensities immediately
adjacent  to  the  target  image.  The  window  frame  is  partitioned  into  two 
regions,  B and  P.  Region  B is  used  to  provide  a sample  of  the  back- 
ground intensities,  and  region  P is  used  to  sample  the  plume  intensities 
when  a plume  is  present.  Using  the  sampled  intensities,  a very  simple 
decision  rule  is  used  to  classify  the  pixels  in  region  T as  follows: 

• Background points--All pixels in region T with intensities
found in region B are classified as background points.

• Plume points--All pixels in region T with intensities found
in region P, but not found in region B, are classified as
plume points.

• Target points--All pixels in region T with intensities not
found in either region B or P are classified as target
points.
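
The three rules above amount to a lookup table indexed by intensity, in the spirit of the decision memory described earlier. The sketch below is illustrative only: the function names, the string labels, and the sample intensities are assumptions, and the real processor works from histogram memories in hardware, not from Python sets.

```python
BACKGROUND, PLUME, TARGET = "background", "plume", "target"

def build_decision_table(region_b, region_p, levels=256):
    """Build one classification entry per gray level.

    region_b -- intensities sampled in background region B
    region_p -- intensities sampled in plume region P
    """
    seen_in_b = set(region_b)
    seen_in_p = set(region_p)
    table = []
    for intensity in range(levels):
        if intensity in seen_in_b:
            table.append(BACKGROUND)      # found in B
        elif intensity in seen_in_p:
            table.append(PLUME)           # found in P but not in B
        else:
            table.append(TARGET)          # found in neither B nor P
    return table

def classify(region_t, table):
    """Classify each pixel of region T by table lookup."""
    return [table[p] for p in region_t]

# Hypothetical samples: intensity 12 appears in both B and P.
table = build_decision_table(region_b=[10, 11, 12, 13],
                             region_p=[12, 240, 250])
labels = classify([11, 12, 240, 128], table)
```

Because the table is built once per field and each pixel then costs a single lookup, a rule of this form is compatible with the 96-nanosecond pixel budget quoted above.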

A tracking window placed about the target image provides a
method for sampling the pixel features associated with the target and
background images. The background sample should be taken relatively
close to the target image, and it must be of sufficient size to
accurately characterize the background intensity distribution in the
vicinity of the target. The tracking window also serves as a bandpass filter
by restricting the target search region to the immediate vicinity of the
target. Although one tracking window is satisfactory for tracking
missile targets with plumes, two windows provide additional reliability and
flexibility for independently tracking a target and plume, or two
targets. Having two independent windows allows each to be optimally
configured and provides reliable tracking when either window can track.

FIGURE 2. TRACKING WINDOW

The Projection Processor: The projection processor consists
of a projection accumulation memory (PAM) and a standard processor which
are designed to form projections of simultaneous target and plume
windows and to compute structural parameters from the projections. The
pixel data from each tracking window enters the PAM in real-time as a
synchronized serial stream on lines TD and PD. As the classified pixel
data is received, the PAM accumulates the projection data while the
processor monitors the y-projections, accumulates the total number of
target and plume points, and determines the midpoints used to split the
x-projections. Each x-projection is split to allow the computation of
target and plume attitude angles based on the locations of the median
centers of the x- and y-projections of the top half and bottom half of
the target and plume images.

During the vertical retrace interval, the projection processor
divides each projection into eight segments of equal mass using a simple
algorithm to sequentially address each line of the projection and
multiply the number of pixels in the line by eight. If the result exceeds
the total number of pixels in the projection, a flag is sent to the PAM
forcing the next line to be placed at the beginning of the next 1/8
segment of the projection. If the result is less than the total number of
pixels in the projection, additional lines of pixels are accumulated
until the line containing the 1/8 percentile point is located.
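
One way to read the equal-mass division is as a search for the lines at which the running pixel count crosses each eighth of the total. The sketch below follows that reading and keeps the multiply-by-eight comparison, which avoids any division. It is an interpretation for illustration only; the function name and the sample projection are invented, and the PAM flagging mechanism is not modeled.

```python
def eighth_points(projection):
    """Return the line indices holding the 1/8 through 7/8 mass points.

    projection -- pixel counts per line of one binary projection
    """
    total = sum(projection)
    if total == 0:
        return []
    points = []
    running = 0
    next_eighth = 1
    for line, count in enumerate(projection):
        running += count
        # Compare 8 * accumulated mass with k * total (no division).
        while next_eighth <= 7 and running * 8 >= next_eighth * total:
            points.append(line)
            next_eighth += 1
    return points

# Hypothetical projection of a symmetric blob, 32 pixels in all.
projection = [0, 2, 4, 6, 8, 6, 4, 2, 0]
points = eighth_points(projection)
median_line = points[3]   # the 4/8 point is the median
```

The seven boundary lines, together with the total point count, are the kind of compact structural description that the following paragraph passes on to the tracker processor.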

The  1/8  percentile  points  for  each  of  the  six  projections  are 
computed within 410 µsec of the vertical retrace period and then passed
to  the  communication  memory  along  with  the  total  number  of  target  and 
plume  points.  These  parameters  constitute  the  structural  parameters 
used  by  the  tracker  processor  to  define  an  intelligent  tracking  strategy. 
Figure  3 illustrates  the  accumulation  of  the  projections  and  the  compu- 
tation of  the  percentile  points  and,  for  simplicity,  omits  the  splitting 
of  the  x-projection. 

Tracker  Processor:  The  tracker  processor  receives  the  struc- 
tural parameters  from  the  projection  processor,  locates  and  character- 
izes the  structure  of  the  target  and  plume  images,  and  decides  on  a 
tracking  strategy  to  maintain  track.  It  then  outputs  control  signals  to 
place  the  window  frames  in  the  video  processor  and  outputs  target  loca- 
tion and  orientation  data  to  the  control  processor  along  with  a confi- 
dence in  the  measured  data.  Since  it  operates  on  the  projection  data 
from  field  n while  the  projections  for  the  next  field  (n+1)  are  being 
accumulated,  the  tracker  processor  is  always  one  field  behind  the  video 
and  projection  processors.  The  tracker  and  control  processors  must  both 
finish  their  calculations  before  the  vertical  retrace  interval  begins 
for  field  n+1.  This  constraint  requires  the  tracker  processor  to  output 
its  data  to  the  control  processor  within  7 milliseconds  after  it  receives 
the  projection  data. 

Since the tracker processor is the only processor that
communicates with all three of the other processors, each of which has
its own coordinate system, the tracker processor must interpret the input
data intelligently and then output the appropriate data to the video and
control processors in their respective coordinate systems. The inputs
are positive 16-bit integers defined for a coordinate system whose
origin is the first pixel scanned inside the appropriate tracking window.
The outputs to the video processor are 9-bit positive integers defined
for a coordinate system whose origin is the first pixel scanned within
the FOV. The 16-bit outputs to the control processor are defined for a
coordinate system whose origin is the boresight.
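
The three coordinate systems can be related by simple translations, as in the Python sketch below. The function names, the example window origin, and the boresight pixel location are hypothetical, and the same axis directions are assumed for all three systems; only the origins and the 9-bit range of the video processor outputs come from the text.

```python
FOV_LIMIT = 1 << 9  # 9-bit positive integers span 0..511


def window_to_fov(point, window_origin):
    """Shift a window-relative (x, y) so the origin is the first FOV pixel."""
    x = point[0] + window_origin[0]
    y = point[1] + window_origin[1]
    if not (0 <= x < FOV_LIMIT and 0 <= y < FOV_LIMIT):
        raise ValueError("coordinate does not fit in 9 bits")
    return x, y


def fov_to_boresight(point, boresight):
    """Shift an FOV-relative (x, y) so the origin is the boresight."""
    return point[0] - boresight[0], point[1] - boresight[1]


# Hypothetical values: window placed at (100, 60), boresight at (256, 128).
fov_xy = window_to_fov((10, 5), (100, 60))
bore_xy = fov_to_boresight(fov_xy, (256, 128))
```

Boresight-relative values are signed, which is consistent with the control processor outputs being wider than the 9-bit window positions.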

FIGURE 3. PROJECTIONS AND PERCENTILE POINTS

An overall view of the functions of the tracker processor is
given in Figure 4. It has two modes of operation, the initial
acquisition mode and the autotrack mode. The initial acquisition mode is
used when the RTV system is trying to lock onto the target of interest.
During this mode, the video processor does little or no learning on the
target and plume intensities. The tracker processor will not instruct
the control processor to begin predicting the target location until it
is sure of the existence of at least the plume within the plume window.
When the plume image moves into an appropriate region of the FOV, the
tracker processor will notify both the video processor and the control
processor with a flag indicating that it is now ready to shift into the
autotrack mode.

FIGURE 4. TRACKER PROCESSOR FUNCTIONS

The autotrack algorithm is divided into the four main modules
shown in Figure 4. The data conversion module transforms the projection
input data into physical variables such as target and plume size,
position, and shape. These variables are then combined with previous
target activity data from the history update module to obtain additional
variables such as the changes in target and plume position and size.
All of these variables are compared with preassigned reference constants
to obtain a set of binary inputs which are used directly by the state
interpretation module to define the current tracking situation and
produce an optimum tracking strategy. The strategy is implemented by the
output computation module in the form of control signals to the video
and control processors.
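
The comparison and state interpretation steps can be pictured as a threshold stage followed by a table lookup, as in the following Python sketch. The variable names, reference constants, and strategy labels are all invented for illustration; the report does not list the actual variables, thresholds, or states.

```python
# Hypothetical reference constants (not taken from the report).
REFERENCE = {"target_points": 20, "plume_points": 50, "position_change": 15}
ORDER = ("target_points", "plume_points", "position_change")

# Hypothetical strategy table keyed by the tuple of binary inputs.
STRATEGY = {
    (1, 1, 0): "track target and plume",
    (0, 1, 0): "track plume only",
    (1, 0, 0): "track target only",
}


def binary_inputs(measured):
    """Compare each measured variable with its reference constant."""
    return tuple(int(measured[name] >= REFERENCE[name]) for name in ORDER)


def interpret(measured):
    """Map the binary inputs to a tracking strategy, coasting by default."""
    return STRATEGY.get(binary_inputs(measured), "coast on predicted position")


strategy = interpret(
    {"target_points": 30, "plume_points": 80, "position_change": 3}
)
```

Reducing the measurements to a few bits keeps the decision logic small enough to be evaluated well within the time available in a single video field.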

The Control Processor: The function of the control processor
is to generate the four control signals that drive the real-time video
tracker; i.e., the tracker azimuth and elevation, which are sent to
the RTV-Contraves system interface, and the optics rotation and zoom,
which are sent to the RTV-zoom/rotation interface (Figure 1). In
addition, the control processor outputs the following tracking data to
the I/O processor after each field so that it can be recorded in the
vertical retrace period of the video tape: field count, tracker status,
time, x-displacement from boresight, y-displacement from boresight,
tangent of the target orientation angle from vertical boresight, target
azimuth, target elevation, tracker azimuth, tracker elevation, image
rotation angle, and zoom ratio.

The tracking optics feeds the target image to the video
processor portion of the RTV processor (Figure 1) which establishes the
target coordinates with respect to the optics boresight. The control
processor combines current target coordinates with previous target
coordinates to point the optics toward the next expected target position.
The predicted control equations are based on the combination of linear
and quadratic optical estimates taken from a five-deep history stack.
Since the input data is derived from field (K-1), and the estimates are
being computed during field K, the control estimates must predict ahead
two time increments to provide control signals which will place the
boresight at the correct position during frame K+1.
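
A two-step-ahead prediction from a five-deep history can be sketched in Python as shown below. The report does not give the control equations, so the least-squares polynomial fits, the equal weighting of the linear and quadratic estimates, and all names here are assumptions chosen only to illustrate the idea.

```python
from fractions import Fraction


def polyfit_predict(history, degree, steps_ahead):
    """Least-squares polynomial fit over equally spaced samples.

    history[-1] is the newest sample (time 0); older samples sit at
    times -1, -2, ... The fitted polynomial is evaluated at steps_ahead.
    """
    n = len(history)
    times = [Fraction(i - (n - 1)) for i in range(n)]
    m = degree + 1
    # Normal equations (A^T A) c = A^T y for the basis 1, t, t^2, ...
    ata = [[sum(t ** (r + c) for t in times) for c in range(m)]
           for r in range(m)]
    aty = [sum(Fraction(y) * t ** r for t, y in zip(times, history))
           for r in range(m)]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(m):
        pivot = next(r for r in range(col, m) if ata[r][col] != 0)
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(m):
            if r != col:
                factor = ata[r][col] / ata[col][col]
                ata[r] = [a - factor * b for a, b in zip(ata[r], ata[col])]
                aty[r] -= factor * aty[col]
    coeffs = [aty[r] / ata[r][r] for r in range(m)]
    t = Fraction(steps_ahead)
    return float(sum(c * t ** k for k, c in enumerate(coeffs)))


def predict_two_ahead(history, quadratic_weight=0.5):
    """Blend linear and quadratic estimates two fields ahead (assumed weights)."""
    linear = polyfit_predict(history, 1, 2)
    quadratic = polyfit_predict(history, 2, 2)
    return (1.0 - quadratic_weight) * linear + quadratic_weight * quadratic


# Five-deep stack of one coordinate, oldest first, moving 2 units per field.
stack = [1, 3, 5, 7, 9]
predicted = predict_two_ahead(stack)
```

For a target moving at constant velocity both estimates agree, so the blend matters only when the trajectory curves: the quadratic term follows acceleration, while the linear term is less sensitive to measurement noise.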

COMPUTER SIMULATION OF THE REAL-TIME VIDEO TRACKER. A computer
simulation of the RTV tracking system, incorporating the algorithms
used in the control stores of the four distributive processors, has been
developed and implemented on the PDP 11/35 system at WSMR. The purpose
of this simulation is to provide a method for testing new design
concepts and evaluating the RTV tracking system under realistic tracking
conditions. The simulation model includes dynamic models for the target
trajectory and the Contraves Model F cinetheodolite tracking system, in
addition to the RTV processor algorithms, for simulating the complete
tracking system. The recent development of an image processing
laboratory at WSMR has enabled research personnel to digitize sequential
video fields of typical tracking imagery. These fields of digitized video
are now being used in the RTV simulation and in the development of
improved image segmentation and structural analysis algorithms.

The RTV simulation is being used as a research tool at WSMR.
It is especially effective in evaluating the RTV system performance and
in identifying and seeking solutions to real-time tracking problems
before the RTV tracking system is deployed. With the added capability
of using digitized video from a variety of tracking sequences as inputs
to the video processor, the simulation can now test the system
performance under a variety of tracking conditions, thus allowing
thorough evaluation and possible refinement of the tracking and
processing algorithms and the state transitions of the tracker processor.

CONCLUSION: RTV tracking is not new, but recent developments
have added new capabilities that enhance the advantages of these
systems. Video tracking offers some distinct advantages over electronic
tracking (such as ECM immunity), but suffers from some disadvantages as
well (such as restrictions in visibility). Several other aspects of
system development for RTV tracking are discussed in the papers and
reports listed in the bibliography.

A continuing research need exists for better understanding of
imagery. A human incorporates many elements into the parameter vector
that he uses to identify an object. The difficulty of understanding the
human visual process has caused rather slow progress in teaching
computers to "see." We know that the human uses such things as texture,
orientation, color, size, shading, shape, context, etc. to identify
objects. Parameter vectors which incorporate these elements are
difficult to quantify. It is not necessary, however, to require the
computer to see the same things a human does. It is difficult to
visualize elements of parameter vectors that do not have a physical
meaning to a human, but which may be useful for computer recognition
processes. Much work remains to be done to produce a highly
sophisticated sight process in a computer.


The concepts described in this paper have, however, been
tested and will result in a prototype system deployed in 1979. Through a
process of simulation and breadboard verification, WSMR has determined
that such a system is well within the current capabilities of
technology. A great deal of national (and some international) attention
has been focused on this project because of the unique applications of
pattern recognition in a tracking situation.